TITLE: Designing Modulators for Galactosyltransferases
FIELD OF THE INVENTION 

The invention relates to structures and models of ligand binding domains of galactosyltransferases, and of the
ligand binding domains with ligands. The structural coordinates that define the structures and any ligands bound to
the structures enable the determination of homologues, the structures of polypeptides with unknown structure, and
the identification of modulators of the galactosyltransferases. The invention also relates to structures and models of
nucleotide-sugar donors for the galactosyltransferases, and the design of modulators for the galactosyltransferases
based on the properties of these structures and models.
BACKGROUND OF THE INVENTION 

Carbohydrate groups of glycoproteins are involved in various signaling and molecular recognition
processes leading to important biological functions (1) and diseases (2). The processing and synthesis of a large
number of both N- and O-linked carbohydrate chains involve the sequential and coordinated action of many
different glycosyltransferases. Glycosyltransferases catalyze the transfer of a monosaccharide from nucleotide sugars
to a specific hydroxyl of various saccharide acceptors, which leads to the formation of a new glycosidic linkage. There
is at least one distinct glycosyltransferase for every type of glycosidic linkage.

Galactosyltransferases are a class of enzymes that utilize uridine-5'-diphosphogalactose (UDP-Gal) as the
donor. Recently, a retaining galactosyltransferase, α-1,3-galactosyltransferase (α-1,3-GalT; E.C. 2.4.1.151) (4), has
attracted much attention because of the problem of organ rejection in xenotransplantation. This enzyme is responsible
for the formation of terminal α-Gal sequences in Galα1-3Galβ1-GlcNAcα1-R. Oligosaccharide structures with a
terminal Galα1-3Galβ sequence (α-galactosyl epitopes) are xenoactive antigens (5) and are considered to be the
major cause of hyperacute rejection in xenotransplantation. α-1,3-Galactosyltransferase is absent in humans and,
conversely, large quantities of natural anti-α-1,3-Gal antibodies exist in the human body which react with the α-Gal
epitope, thus providing a barrier to xenotransplantation. The appearance of aberrant α-1,3-GalT in human cells is
assumed to be responsible for some forms of autoimmune disease (6).

Galactosyltransferases share a common topology with type II membrane proteins. Type II membrane
proteins generally have a large N-terminal catalytic domain, a short stem region, and a hydrophobic-rich
transmembrane domain (3). Although various groups have performed a host of biochemical studies on this enzyme
to understand structure-function relationships, the actual binding and catalytic mechanism of α-1,3-GalT is poorly
understood. Understanding these aspects in atomic detail requires a three-dimensional structure of α-1,3-GalT
and structural information about the binding of UDP-Gal and the oligosaccharide acceptor in the active site of
α-1,3-GalT. Unfortunately, no crystal structure of α-1,3-GalT is available in native or complexed form.

SUMMARY OF THE INVENTION 

The present inventors have produced a homology model for galactosyltransferases, and complexes of the
enzymes with ligands including UDP and UDP-Gal. The homology model was developed by means of molecular
modeling using the SpsA glycosyltransferase structure. In particular, a protein-ligand docking approach was used to
model α-1,3-GalT complexed with UDP and UDP-Gal. In the predicted model complex, the diphosphate interacts
with a DVD motif (Asp-225, Val-226 and Asp-227) of α-1,3-GalT through a Mn2+ cation. The uridine part of the
UDP binds into the cavity that consists of Phe-134, Tyr-139, Ile-140, Val-136, Arg-194, Arg-202, Lys-209,
Asp-173, His-218, and Thr-137, in a "canonical conformation". Structural features of the α-1,3-GalT model were
compared with available structural data on this class of enzymes and revealed similarities in the UDP binding
pocket.

The invention provides a model or secondary, tertiary, and/or quaternary structure of a ligand binding
domain of a galactosyltransferase. Binding domains are of significant utility in drug discovery. The association of
natural ligands and substrates with the binding domains of galactosyltransferases is the basis of biological
mechanisms. The associations may occur with all or any parts of a binding domain. An understanding of these
associations will lead to the design and optimization of drugs having more favorable associations with their target
enzyme and thus provide improved biological effects. Therefore, information about the shape and structure of
galactosyltransferases and their ligand-binding domains is invaluable in designing potential modulators of
galactosyltransferases for use in treating diseases and conditions associated with or modulated by the
galactosyltransferases.

Ligand binding domains include one or more of the binding domains for a diphosphate group of a sugar
nucleotide donor, a nucleotide of a sugar nucleotide donor, a nitrogenous heterocyclic base (preferably a pyrimidine
base, more preferably uracil) of a sugar nucleotide donor, a sugar of the nucleotide of a sugar nucleotide donor, a
selected sugar of a sugar nucleotide donor that is transferred to an acceptor, and/or an acceptor. The structure of a
ligand binding domain may be defined by selected binding sites in the domain.

Thus, broadly stated, the present invention provides a model or a secondary or three dimensional structure of
a ligand binding domain of a galactosyltransferase comprising one or more of the amino acid residues shown in
Table 1 or Figure 2, 3, 4, or 6.

The invention also relates to a model or a secondary or three dimensional structure of a ligand binding
domain of a galactosyltransferase defined by the structural coordinates of one or more of the atomic interactions or
contacts of Table 1. Each of the atomic interactions is defined in Table 1 by an atomic contact (more preferably a
specific atom where indicated) on the sugar nucleotide donor and an atomic contact (more preferably a specific atom
where indicated) on the galactosyltransferase.

In accordance with an aspect of the invention, there is also provided a model of a ligand binding domain
designed in accordance with a method of the invention and comprising hydrogen-bonding partners for the amide
hydrogen, the carbonyl oxygen in position 4, and the carbonyl oxygen of uracil.

The invention also provides a model of a ligand binding domain that binds the uridine portion of UDP and
comprises two or more of Phe-134, Tyr-139, Ile-140, Val-136, Arg-194, Arg-202, Lys-209 (numbered as ATOM
204 in Table 8), Asp-173 (numbered as ATOM 169 in Table 8), His-218 (numbered as ATOM 213 in Table 8), and
Thr-137 (numbered as ATOM 132 in Table 8). The invention also provides a model of a ligand binding domain that
interacts with a pyrophosphate portion of UDP comprising Asp-225, Val-226, and Asp-227.
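
By way of illustration only, the residues named above can be kept in a small lookup structure so that a modeling script can refer to the uridine pocket and the DVD motif by name. The following Python sketch is an assumption about representation and is not part of the disclosed model; the names `URIDINE_POCKET`, `DVD_MOTIF`, `TABLE8_NUMBER`, `table8_number`, and `comprises_at_least` are hypothetical. The Table 8 numbers are copied from the paragraph above and are recorded only for the four residues for which the text states them.

```python
# Hypothetical bookkeeping for the residues named in the text (not the model itself).
URIDINE_POCKET = [
    ("Phe", 134), ("Tyr", 139), ("Ile", 140), ("Val", 136), ("Arg", 194),
    ("Arg", 202), ("Lys", 209), ("Asp", 173), ("His", 218), ("Thr", 137),
]
DVD_MOTIF = [("Asp", 225), ("Val", 226), ("Asp", 227)]

# Table 8 numbering as stated in the text; given for four residues only.
TABLE8_NUMBER = {209: 204, 173: 169, 218: 213, 137: 132}


def table8_number(residue_number):
    """Return the Table 8 number for a residue, or None if the text gives none."""
    return TABLE8_NUMBER.get(residue_number)


def comprises_at_least(candidate, required, minimum=2):
    """True if `candidate` contains at least `minimum` of the `required` residues."""
    return len(set(candidate) & set(required)) >= minimum
```

The `minimum=2` default mirrors the "two or more" wording used for the uridine-binding model; other thresholds would be passed explicitly.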

The invention provides a model or secondary, tertiary and/or quaternary structure of a
galactosyltransferase.

The invention contemplates a model or secondary, tertiary and/or quaternary structure of a
galactosyltransferase in association with a ligand or substrate.

The structures and models of the invention provide information about the atomic contacts involved in the
interaction between the enzyme and a known ligand which can be used to screen for unknown ligands. Therefore the
present invention provides a method of screening for a ligand capable of binding a galactosyltransferase ligand
binding domain, comprising the use of a secondary or three-dimensional structure or a model of the invention. For
example, the method may comprise the step of contacting a ligand binding domain with a test compound, and
determining if the test compound binds to the ligand binding domain.

A method of the invention may identify a ligand which can modulate the biological activity of a 
galactosyltransferase. Such a ligand is referred to herein as a "modulator". In an embodiment, the present invention
contemplates a method of identifying a modulator of a galactosyltransferase or a ligand binding domain or binding 
site thereof, comprising the step of using the structural coordinates of a galactosyltransferase or a ligand binding 
domain or binding site thereof, or a model of the invention to computationally evaluate a test compound for its 
ability to associate with the galactosyltransferase or ligand binding domain or binding site thereof. Use of the 
structural coordinates of a galactosyltransferase structure, ligand binding domain, or binding site thereof, of the 
invention to identify a ligand or modulator is also provided. 

A structure or model of the invention may be used to design, evaluate, and identify ligands of
galactosyltransferases other than ligands that associate with a galactosyltransferase. The ligands may be based on the
shape and structure of a galactosyltransferase, or a ligand binding domain or atomic interactions, or atomic contacts 
thereof. Therefore, ligands, in particular modulators, may be derived from ligand binding domains or analogues or 
parts thereof. 

The present invention also contemplates a ligand identified by a method of the invention. A ligand may be a
competitive or non-competitive inhibitor of a galactosyltransferase. Preferably, the ligand is capable of modulating 
the activity of a galactosyltransferase enzyme. Thus the methods of the invention permit the identification early in 
the drug development cycle of compounds that have advantageous properties. 

In an embodiment of the invention, a method is provided for identifying a potential modulator of a 
galactosyltransferase by determining binding interactions between a test compound and atomic contacts of a binding 
domain of a galactosyltransferase defined in accordance with the invention comprising: 

(a) generating the atomic contacts on a computer screen;

(b) generating test compounds with their spatial structure on the computer screen;

(c) determining whether the compounds associate or interact with the atomic contacts defining the
galactosyltransferase; and

(d) identifying test compounds that are potential modulators by their ability to enter into a selected
number of atomic contacts.
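
Steps (c) and (d) can be sketched computationally as a simple contact count. The Python example below is a minimal illustration under stated assumptions, not the method of the invention: atoms are plain (x, y, z) tuples in angstroms, a "contact" is assumed to mean that a binding-domain atom lies within a distance cutoff of any atom of the test compound, and the 4.0 angstrom default cutoff, the function names, and the example coordinates are all hypothetical.

```python
import math


def count_contacts(site_atoms, compound_atoms, cutoff=4.0):
    """Count binding-domain atoms that have a compound atom within `cutoff` angstroms."""
    contacts = 0
    for site_atom in site_atoms:
        if any(math.dist(site_atom, atom) <= cutoff for atom in compound_atoms):
            contacts += 1
    return contacts


def potential_modulators(site_atoms, compounds, min_contacts, cutoff=4.0):
    """Return the names of compounds that reach at least `min_contacts` contacts."""
    return [
        name
        for name, atoms in compounds.items()
        if count_contacts(site_atoms, atoms, cutoff) >= min_contacts
    ]


# Hypothetical coordinates, for illustration only.
site = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
library = {
    "compound_A": [(1.0, 0.0, 0.0), (5.0, 1.0, 0.0)],
    "compound_B": [(20.0, 0.0, 0.0)],
}
selected = potential_modulators(site, library, min_contacts=2)
```

A purely geometric count of this kind ignores the chemical nature of each contact; a practical implementation would presumably also weight hydrogen bonds, charge, and steric clashes.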

Another aspect of the invention provides methods for identifying a potential modulator of a 
galactosyltransferase function by docking a computer representation of a test compound with a computer 
representation of a structure of a galactosyltransferase or a ligand binding domain thereof that is defined as described 
herein. In an embodiment the method comprises the following steps: 

(a) docking a computer representation of a compound from a computer data base with a computer
representation of atomic interactions or contacts of a ligand binding domain of a
galactosyltransferase to obtain a complex;

(b) determining a conformation of the complex with a favourable geometric fit and favourable
complementary interactions; and

(c) identifying test compounds that best fit the atomic interactions or contacts as potential modulators 
of the galactosyltransferase. 
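
The selection logic of steps (b) and (c) can be illustrated with a toy scoring function. The sketch below is not the docking procedure used to build the model and does not reproduce any particular docking program: it assumes that candidate poses have already been generated, scores each pose with a generic 12-6 Lennard-Jones style term over all atom pairs, and ranks compounds by their best pose. The parameters `r_min` and `depth`, the function names, and the example coordinates are hypothetical.

```python
import math


def lj_score(site_atoms, pose_atoms, r_min=3.5, depth=1.0):
    """Toy 12-6 score summed over all site/pose atom pairs; lower is better.

    Each pair contributes depth * ((r_min/r)**12 - 2*(r_min/r)**6), which has its
    minimum of -depth at r == r_min. Atoms must not coincide (r > 0).
    """
    total = 0.0
    for site_atom in site_atoms:
        for pose_atom in pose_atoms:
            ratio = r_min / math.dist(site_atom, pose_atom)
            total += depth * (ratio ** 12 - 2.0 * ratio ** 6)
    return total


def best_pose(site_atoms, poses):
    """Return (score, pose_index) for the lowest-scoring pose of one compound."""
    return min((lj_score(site_atoms, pose), index) for index, pose in enumerate(poses))


def rank_compounds(site_atoms, library):
    """Sort compound names by the score of their best pose, best first."""
    best = {name: best_pose(site_atoms, poses)[0] for name, poses in library.items()}
    return sorted(best, key=best.get)


# Hypothetical one-atom site and poses, for illustration only.
site = [(0.0, 0.0, 0.0)]
library = {
    "compound_A": [[(3.5, 0.0, 0.0)], [(2.0, 0.0, 0.0)]],  # second pose clashes
    "compound_B": [[(10.0, 0.0, 0.0)]],                    # too far to interact much
}
ranking = rank_compounds(site, library)
```

The single-term score stands in for the "favourable geometric fit and favourable complementary interactions" of step (b); real scoring functions typically add electrostatic, hydrogen-bond, and desolvation terms.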
In another embodiment the method comprises the following steps:

(a) modifying a computer representation of a test compound complexed with a ligand binding domain 
of a galactosyltransferase by deleting or adding a chemical group or groups; 

(b) determining a conformation of the complex with a favourable geometric fit and favourable
complementary interactions; and

(c) identifying a test compound that best fits the ligand binding domain as a potential modulator of a 
galactosyltransferase. 

In still another embodiment the method comprises the following steps: 

(a) selecting a computer representation of a test compound complexed with atomic contacts of a
binding domain of a galactosyltransferase; and

(b) searching for molecules in a data base that are similar to the test compound using a searching 
computer program, or replacing portions of the test compound with similar chemical structures 
from a data base using a compound building computer program. 
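
The database search of step (b) is commonly done by comparing structural fingerprints. The following Python sketch assumes, purely for illustration, that each compound is described by a set of feature labels and that similarity is measured with the Tanimoto coefficient; the text does not specify a similarity measure or a searching program, and the feature names, threshold, and database entries below are hypothetical.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) coefficient between two feature sets, from 0.0 to 1.0."""
    a, b = set(features_a), set(features_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def similar_compounds(query, database, threshold=0.5):
    """Return (similarity, name) pairs at or above `threshold`, most similar first."""
    hits = [(tanimoto(query, features), name) for name, features in database.items()]
    return sorted((hit for hit in hits if hit[0] >= threshold), reverse=True)


# Hypothetical feature sets, for illustration only.
query = {"uracil", "ribose", "diphosphate"}
database = {
    "entry_1": {"uracil", "ribose", "diphosphate", "galactose"},
    "entry_2": {"uracil", "ribose"},
    "entry_3": {"benzene"},
}
hits = similar_compounds(query, database)
```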

The ligands or compounds identified according to the methods of the invention preferably have structures
such that they are able to enter into an association with a ligand binding domain. Selected ligands or compounds may
be characterized by their suitability for binding to particular binding domains. A ligand binding domain or binding
site may be regarded as a type of negative template with which the compounds correlate as positives in the manner
described herein and thus the compounds are unambiguously defined. Therefore, it is possible to describe the
structure of a compound suitable as a modulator of a galactosyltransferase by accurately defining the atomic
interactions through which the compound binds to a ligand binding domain and deriving the structure of the compound
from the spatial structure of the target.

The invention contemplates a method for the design of ligands, in particular modulators, for
galactosyltransferases based on the three dimensional structure of a sugar nucleotide donor (or part thereof) defined
in relation to its spatial association with the three dimensional structure of the galactosyltransferase or a ligand
binding domain thereof. Generally, a method is provided for designing potential inhibitors of a galactosyltransferase
comprising the step of using the structural coordinates of a sugar nucleotide donor or part thereof, defined in relation
to its spatial association with a three dimensional structure or model of a galactosyltransferase or a ligand binding
domain thereof, to generate a compound for associating with a ligand binding domain of the galactosyltransferase.
The following steps are employed in a particular method of the invention: (a) generating a computer representation
of a sugar nucleotide donor, or part thereof, defined in relation to its spatial association with the three dimensional
structure of a galactosyltransferase or a ligand binding domain thereof; and (b) searching for molecules in a data base
that are similar to the defined sugar nucleotide donor, or part thereof, using a searching computer program, or
replacing portions of the compound with similar chemical structures from a database using a compound building
computer program.

Therefore, the invention further contemplates classes of ligands, in particular modulators, of a
galactosyltransferase based on the three-dimensional structure of a sugar nucleotide donor, or part thereof, defined in
relation to the sugar nucleotide donor's spatial association with a three dimensional structure of a
galactosyltransferase.
It will be appreciated that a ligand or modulator of a galactosyltransferase may be identified by generating
an actual secondary or three-dimensional model of a ligand binding domain or binding site, synthesizing a 
compound, and examining the components to find whether the required interaction occurs. 

Modulators which are capable of modulating the activity of galactosyltransferases have therapeutic and 
prophylactic potential. Therefore, the methods of the invention for identifying modulators may comprise one or more 
of the following additional steps: 

(a) testing whether the ligand is a modulator of the activity of a galactosyltransferase, preferably 
testing the activity of the modulator in cellular assays and animal model assays; 

(b) modifying the modulator; 

(c) optionally rerunning steps (a) or (b); and 

(d) preparing a pharmaceutical composition comprising the modulator. 

Steps (a), (b), (c), and (d) may be carried out in any order, at different points in time, and they need not be sequential.

There is also provided a pharmaceutical composition comprising a modulator, and a method of treating
and/or preventing disease comprising the step of administering a modulator or pharmaceutical composition
comprising a modulator to a mammalian patient.

In an aspect, the invention contemplates a method of treating a disease associated with a
galactosyltransferase with inappropriate activity in a cellular organism, comprising: 

(a) administering a modulator identified using the methods of the invention in an acceptable 
pharmaceutical preparation; and 

(b) activating or inhibiting a galactosyltransferase to treat the disease. 

The invention provides for the use of a modulator identified by the methods of the invention in the 
preparation of a medicament to treat a disease associated with a galactosyltransferase with inappropriate activity in a 
cellular organism. Use of the structural coordinates of a galactosyltransferase structure of the invention to 
manufacture a medicament is also provided. 

Another aspect of the invention provides machine readable media encoded with data representing a model 
of the invention or the coordinates of a structure of a galactosyltransferase or ligand binding domain or binding site 
thereof as defined herein, or the three dimensional structure of a sugar nucleotide donor defined in relation to its 
spatial association with a three dimensional structure of a galactosyltransferase as defined herein. The invention also
provides computerized representations of a model of the invention or the secondary or three-dimensional structures
of the invention, including any electronic, magnetic, or electromagnetic storage forms of the data needed to define
the structures such that the data will be computer readable for purposes of display and/or manipulation. The 
invention further provides a computer programmed with a homology model of a ligand binding domain of a 
galactosyltransferase. The invention still further contemplates the use of a homology model of the invention as input 
to a computer programmed for drug design and/or database searching and/or molecular graphic imaging in order to 
identify new ligands for galactosyltransferases. 

These and other aspects of the present invention will become evident upon reference to the following 
detailed description and attached drawings. 
BRIEF DESCRIPTION OF THE DRAWINGS

The invention will now be described in relation to the drawings in which: 
Figure 1. Sequence alignment between SpsA and bovine α-1,3-GalT.

Figure 2. A superposition of the SpsA structure and the α-1,3-GalT model. The active site residues of SpsA
and the corresponding residues of α-1,3-GalT are shown as tubes. SpsA is shown in magenta and α-1,3-GalT is in
blue. The side-chains of the α-1,3-GalT model are labeled. The active site modeled metal ion is shown as a red
sphere.

Figure 3. The low-energy computed docking modes of UDP to the α-1,3-GalT. About 60 low energy
binding modes of UDP are shown in colored lines. The lowest energy binding mode is shown in thick tube. The
critical amino acid residues are shown and labeled. All the low energy binders assume similar binding orientation.

Figure 4. Possible docking modes of UDP-Gal to the α-1,3-GalT. The lowest-energy docking mode is
shown as thick tube and some of the low energy binding modes are shown as thin lines.

Figure 5. The predicted complex of α-1,3-GalT and the inhibitor. Two top ranking docking modes are
shown and in both, the inhibitor occupies the acceptor and pyrophosphate binding regions of the α-1,3-GalT. The
lowest energy binding mode is shown in thick tube.

Figure 6 shows the overall view of a docking model of the bovine α-1,3-GalT-UDP complex. GalT is shown
in colored ribbon. The UDP is shown in thick tubes. The amino acid residues that interact with UDP are shown in
tubes and the modeled Mn2+ is shown as a sphere. The conserved DVD motif interaction with a metal can be seen.

Figure 7 shows an overall representation of the UDP-Gal complex.

Figure 8 shows computed low energy binding modes of UDP-Gal.

Figure 9 shows lowest energy binding modes of LacNAc-β-OMe to α-1,3-GalT.
DESCRIPTION OF THE TABLES 

Table 1 - Atomic interactions between a galactosyltransferase and UDP.
Table 2 - Characterization of the top five binding modes of UDP to α-1,3-galactosyltransferase.
Table 3 - Predicted secondary structures for the α-1,3-GalT sequence that was used for generating a
homology model of α-1,3-GalT.
Table 4 - Structural coordinates of a galactosyltransferase.
Table 5 - Structural coordinates of UDP.
Table 6 - Structural coordinates of UDP-Gal.
Table 7 - Structural coordinates of uracil, ribose, and pyrophosphate of UDP.
Table 8 - Structural coordinates of a galactosyltransferase.

In Table 4, from the left, the second column identifies the atom number; the third identifies the atom type; 
the fourth identifies the amino acid type; the fifth identifies the residue number; the sixth identifies the x coordinates; 
the seventh identifies the y coordinates; and the eighth identifies the z coordinates. 
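
The column layout described above lends itself to a simple parser. The Python sketch below assumes that each row of Table 4 is whitespace-separated with at least eight columns and that the first column is a record label that can be ignored; neither assumption is confirmed by the text. The names `AtomRecord` and `parse_row` are hypothetical, and the example row uses made-up placeholder values rather than data from Table 4.

```python
from typing import NamedTuple


class AtomRecord(NamedTuple):
    atom_number: int      # second column
    atom_type: str        # third column
    residue_type: str     # fourth column (amino acid type)
    residue_number: int   # fifth column
    x: float              # sixth column
    y: float              # seventh column
    z: float              # eighth column


def parse_row(line):
    """Parse one row laid out as described for Table 4 (first column ignored)."""
    fields = line.split()
    if len(fields) < 8:
        raise ValueError("expected at least 8 whitespace-separated columns")
    return AtomRecord(
        int(fields[1]), fields[2], fields[3], int(fields[4]),
        float(fields[5]), float(fields[6]), float(fields[7]),
    )


# Placeholder row with invented values, NOT taken from Table 4.
example = parse_row("ATOM 1 N ASP 225 1.000 -2.000 3.500")
```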
DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
Definitions: 

Unless otherwise indicated, all terms used herein have the same meaning as they would to one skilled in the 
art of the present invention. Practitioners are particularly directed to Current Protocols in Molecular Biology 
(Ausubel) for definitions and terms of the art. Abbreviations for amino acid residues are the standard 3-letter and/or
1-letter codes used in the art to refer to one of the 20 common L-amino acids.

The term "associate", "association" or "associating" refers to a condition of proximity between a ligand,
chemical entity or compound or portions or fragments thereof, and a galactosyltransferase, or portions or fragments
thereof (e.g. ligand binding domain). The association may be non-covalent, i.e. where the juxtaposition is
energetically favored by, for example, hydrogen-bonding, van der Waals, or electrostatic or hydrophobic interactions,
or it may be covalent.

The term "galactosyltransferase" refers to an enzyme that catalyzes the transfer of a single monosaccharide
unit, i.e. galactose, from a donor to the hydroxyl group of an acceptor saccharide. The acceptor can be either a free
saccharide, glycoprotein, glycolipid, or polysaccharide. The donor can be a sugar nucleotide, preferably UDP-Gal.
Galactosyltransferases show a precise specificity for both the sugar acceptor and donor and generally require the 
presence of a metal cofactor. 

Galactosyltransferases are derivable from a variety of sources, including viruses, bacteria, fungi, plants, and
animals. In a preferred embodiment the galactosyltransferases are derivable from an animal, preferably a mammal
including but not limited to bovine, ovine, porcine, murine, and equine, most preferably a human. The enzyme may be
from any source, whether natural, synthetic, semi-synthetic, or recombinant. Preferably the galactosyltransferase is an
α-1,3-galactosyltransferase, preferably derivable from bovine.

A galactosyltransferase or part thereof in the present invention may be a wild type enzyme, or part thereof, 
or a mutant, variant or homologue of such an enzyme. 
The term "wild type" refers to a polypeptide having a primary amino acid sequence which is identical with
the native enzyme (for example, the mammalian enzyme).

The term "mutant" refers to a polypeptide having a primary amino acid sequence which differs from the
wild type sequence by one or more amino acid additions, substitutions or deletions. Preferably, the mutant has at
least 90% sequence identity with the wild type sequence. Preferably, the mutant has 20 mutations or less over the
whole wild-type sequence. More preferably the mutant has 10 mutations or less, most preferably 5 mutations or less
over the whole wild-type sequence. A mutant may or may not be functional.

The term "variant" refers to a naturally occurring polypeptide which differs from a wild-type sequence. A 
variant may be found within the same species (i.e. if there is more than one isoform of the enzyme) or may be found 
within a different species. Preferably the variant has at least 90% sequence identity with the wild type sequence. 
Preferably, the variant has 20 mutations or less over the whole wild-type sequence. More preferably, the variant has
10 mutations or less, most preferably 5 mutations or less over the whole wild-type sequence. 

The term "part" indicates that the polypeptide comprises a fraction of the wild-type amino acid sequence. It 
may comprise one or more large contiguous sections of sequence or a plurality of small sections. The "part" may 
comprise a ligand binding domain as described herein. The polypeptide may also comprise other elements of
sequence, for example, it may be a fusion protein with another protein. Preferably the polypeptide comprises at least
50%, more preferably at least 65%, most preferably at least 80% of the wild-type sequence. 

The term "homologue" means a polypeptide having a degree of homology with the wild-type amino acid 
sequence. The term "homology" can be equated with "identity". 

In the present context, a homologous sequence is taken to include an amino acid sequence which may be at
least 75, 85 or 90% identical, preferably at least 95 or 98% identical to the wild-type sequence. Typically, the
homologues will comprise the same sites (for example ligand binding domain) as the subject amino acid sequence.
Although homology can also be considered in terms of similarity (i.e. amino acid residues having similar chemical 
properties/functions), in the context of the present invention it is preferred to express homology in terms of sequence 
identity. 

Homology comparisons can be conducted by eye, or more usually, with the aid of readily available
sequence comparison programs. These commercially available computer programs can calculate % homology
between two or more sequences (e.g. Wilbur, W.J. and Lipman, D. J. Proc. Natl. Acad. Sci. USA (1983), 80:726-730).
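
For two sequences that have already been aligned, percent identity can be computed directly. The Python sketch below is a minimal illustration and not a substitute for the sequence comparison programs referred to above: it assumes the alignment is supplied as two equal-length strings with "-" marking gaps, and it assumes that identity is reported over the ungapped aligned columns, which is only one of several conventions in use. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns in which neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap
    ]
    if not columns:
        return 0.0
    matches = sum(1 for a, b in columns if a == b)
    return 100.0 * matches / len(columns)


# Hypothetical aligned fragments, for illustration only.
identity = percent_identity("DVDA", "DVDG")
meets_90_percent = identity >= 90.0
```

A threshold test such as `identity >= 90.0` corresponds to the "at least 90% sequence identity" preference stated for mutants and variants.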

The term "function" refers to the ability of a modulator to enhance or inhibit the association between a
galactosyltransferase and a compound, or the activity of the galactosyltransferase.

"Ligand binding domain" refers to a region of a molecule or molecular complex that as a result of its shape, 
favourably associates with a ligand or a part thereof. For example, it may be a region of a galactosyltransferase that
is responsible for binding a substrate or known modulator. 

The term "ligand binding domain" includes homologues of a ligand binding domain or portions thereof. As
used herein, the term "homologue" in reference to a ligand binding domain refers to a ligand binding domain or a
portion thereof which may have deletions, insertions or substitutions of amino acid residues as long as the binding
specificity of the molecule is retained. In this regard, deliberate amino acid substitutions may be made on the basis
of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the
residues as long as the binding specificity of the ligand binding domain is retained.

As used herein, the term "portion thereof" means the structural coordinates corresponding to a sufficient
number of amino acid residues of a galactosyltransferase ligand binding domain (or homologues thereof) that are
capable of associating with or interacting with a test compound that binds to the ligand binding domain. This term
includes galactosyltransferase ligand binding domain amino acid residues within about 4 Å to about 5 Å of a bound
compound or fragment thereof. Thus, for example, the structural coordinates provided in the
structure may contain a subset of the amino acid residues in the ligand binding domain which may be useful in the
modelling and design of compounds that bind to the ligand binding domain.
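
Selecting such a subset of residues by distance is straightforward to sketch. The Python example below is illustrative only: it assumes protein atoms are supplied as (residue type, residue number, (x, y, z)) tuples and ligand atoms as (x, y, z) tuples in angstroms, and it treats a residue as part of the portion if any of its atoms lies within the cutoff of any ligand atom. The function name and all coordinates are hypothetical.

```python
import math


def residues_near_ligand(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return (residue_type, residue_number) pairs with an atom within `cutoff`."""
    near = set()
    for residue_type, residue_number, xyz in protein_atoms:
        if any(math.dist(xyz, ligand_xyz) <= cutoff for ligand_xyz in ligand_atoms):
            near.add((residue_type, residue_number))
    return sorted(near, key=lambda residue: residue[1])


# Invented coordinates, for illustration only.
protein = [
    ("Asp", 225, (0.0, 0.0, 0.0)),
    ("Asp", 225, (1.0, 0.0, 0.0)),
    ("Val", 226, (3.0, 0.0, 0.0)),
    ("Phe", 134, (20.0, 0.0, 0.0)),
]
ligand = [(4.0, 0.0, 0.0)]
portion = residues_near_ligand(protein, ligand, cutoff=5.0)
```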

A ligand binding domain may be defined by its association with a ligand. With reference to the structures 
and models of the invention, residues in the ligand binding domain may be defined by their spatial proximity to a 
ligand. For example, such may be defined by their proximity to a substrate or modulator. 

A ligand binding domain of the invention may comprise a DVD motif comprising one or more of Asp-225,
Val-226, and Asp-227. A ligand binding domain that binds uridine may comprise one or more of Phe-134, Tyr-139,
Ile-140, Val-136, Arg-194, Arg-202, Lys-209 (numbered as ATOM 204 in Table 8), Asp-173 (numbered as ATOM 169
in Table 8), His-218 (numbered as ATOM 213 in Table 8), and Thr-137 (numbered as ATOM 132 in Table 8).

"Ligand" refers to a compound or entity that associates with a ligand binding domain, including substrates 
or analogues or parts thereof. A ligand may be designed rationally using a model according to the invention. A
ligand may be a modulator.

"Modulator" refers to a molecule which changes or alters the biological activity of a galactosyltransferase. 
A modulator may increase or decrease galactosyltransferase activity, or change its characteristics, or functional or 
immunological properties. It may be an inhibitor that decreases the biological or immunological activity of the
protein. A modulator may include but is not limited to peptides, members of random peptide libraries and
combinatorial chemistry-derived molecular libraries, phosphopeptides (including members of random or partially
degenerate, directed phosphopeptide libraries), antibodies, carbohydrates, monosaccharides, oligosaccharides,
polysaccharides, glycolipids, saponins, heterocyclic compounds, nucleosides or nucleotides or parts thereof, and
small organic or inorganic molecules. A modulator may be an endogenous physiological compound or it may be a
natural or synthetic compound. The term "modulator" also refers to a chemically modified ligand or compound, and
includes isomers and racemic forms. 

The term "structural coordinates" as used refers to a set of values that define the position of one or more
amino acid residues with reference to a system of axes. A data set of structural coordinates defines the three
dimensional structure of a molecule or molecules. Structural coordinates can be slightly modified and still render
nearly identical three dimensional structures. A measure of a unique set of structural coordinates is the
root-mean-square deviation of the resulting structure. Structural coordinates that render three dimensional structures
that deviate from one another by a root-mean-square deviation of less than 2 Å, preferably less than 0.5 Å, more
preferably less than 0.3 Å, may be viewed by a person of ordinary skill in the art as identical.
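
The root-mean-square deviation criterion can be written out directly. The Python sketch below assumes that the two coordinate sets list the same atoms in the same order and have already been superimposed in a common frame; it does not perform the optimal superposition that a structure-comparison program would normally apply first. The function names are hypothetical, and the thresholds are the 2 Å, 0.5 Å, and 0.3 Å values given above.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (a - b) ** 2
        for point_a, point_b in zip(coords_a, coords_b)
        for a, b in zip(point_a, point_b)
    )
    return math.sqrt(squared / len(coords_a))


def viewed_as_identical(coords_a, coords_b, threshold=2.0):
    """True if the RMSD is below `threshold` angstroms (2.0, 0.5, or 0.3 in the text)."""
    return rmsd(coords_a, coords_b) < threshold


# Invented coordinates, for illustration only.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(0.0, 0.0, 0.3), (1.0, 0.0, 0.3)]
deviation = rmsd(reference, shifted)
```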

Variations in structural coordinates may be generated because of mathematical manipulations of the
structural coordinates of a galactosyltransferase described herein. For example, the structural coordinates of Table 4
or 8 may be manipulated by crystallographic permutations of the structural coordinates, fractionalization of the
structural coordinates, integer additions or subtractions to sets of the structural coordinates, inversion of the
structural coordinates or any combination of the above.
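
Two of the manipulations listed above, integer additions and inversion, can be shown to leave all interatomic distances unchanged. The Python sketch below is an illustration with invented coordinates; the function names are hypothetical, and fractionalization and crystallographic permutations are not covered because they depend on unit-cell information not given here. Note that inversion through the origin preserves distances but reverses handedness, so the result is the mirror-image arrangement.

```python
import itertools
import math


def translate(coords, shift):
    """Add a constant (dx, dy, dz) offset to every coordinate."""
    return [tuple(c + s for c, s in zip(point, shift)) for point in coords]


def invert(coords):
    """Invert every coordinate through the origin."""
    return [tuple(-c for c in point) for point in coords]


def pairwise_distances(coords):
    """Distances between all pairs of points, in a fixed pair order."""
    return [math.dist(p, q) for p, q in itertools.combinations(coords, 2)]


def same_distances(coords_a, coords_b, tolerance=1e-6):
    """True if both sets have the same pairwise distances within `tolerance`."""
    da, db = pairwise_distances(coords_a), pairwise_distances(coords_b)
    return len(da) == len(db) and all(abs(x - y) <= tolerance for x, y in zip(da, db))


# Invented coordinates, for illustration only.
original = [(0.0, 0.0, 0.0), (1.0, 2.0, 2.0), (3.0, 0.0, 4.0)]
manipulated = invert(translate(original, (1, -2, 3)))
unchanged = same_distances(original, manipulated)
```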

Variations in structure due to mutations, additions, substitutions, and/or deletions of the amino acids, or
other changes in any of the components that make up a structure of the invention may also account for modifications
in structural coordinates. If such modifications are within an acceptable standard error as compared to the original
structural coordinates, the resulting structure may be the same. Therefore, a ligand that associates with or binds to a
ligand binding domain of a galactosyltransferase would also be expected to associate with or bind to another ligand
binding domain whose structural coordinates defined a shape that fell within the acceptable error. Such modified
structures of a ligand binding domain are also within the scope of the invention.

Various computational analyses may be used to determine whether a ligand or the ligand binding domain 
thereof is sufficiently similar to all or parts of a ligand or ligand binding domain of the invention. Such analyses may 
be carried out using conventional software applications and methods as described herein. 

The term "modeling" includes the quantitative and qualitative analysis of molecular structure and/or
function based on atomic structural information and interaction models. The term includes conventional
numeric-based molecular dynamic and energy minimization models, interactive computer graphic models, modified
molecular mechanics models, distance geometry, and other structure-based constraint models. Preferably modeling
is performed using a computer and may be optimized using known methods. This is called modeling optimization.

The term "substrate" refers to molecules that associate with a galactosyltransferase as it catalyzes the
transfer of a selected sugar from a nucleotide sugar donor to an acceptor that leads to the formation of a new
glycosidic linkage. A substrate includes a sugar nucleotide donor and acceptor and parts thereof.

A "sugar nucleotide donor" refers to a nucleotide coupled to a selected sugar that is transferred by a 
galactosyltransferase to an acceptor. The selected sugar may be a monosaccharide or disaccharide, preferably a 
monosaccharide. A suitable selected sugar includes galactose. The galactose may be modified; for example, the
hydroxyls may be blocked with acetonide, acylated, or alkylated or substituted with other groups such as halogen.
The nucleotide is preferably UDP. The heterocyclic amine base in the nucleotide may be modified. For example,
when the base is uridine it may be modified at the C-5 or C-6 position with groups including but not limited to alkyl,
aryl, gallic acid, and with electron donating and electron withdrawing groups. The sugar in the nucleotide (e.g.
ribose) may be modified at the 2' or 3' position with groups including but not limited to alkyl, aryl, gallic acid, and
with electron donating and electron withdrawing groups.

An "acceptor" refers to the part of a carbohydrate structure (e.g. glycoprotein, glycolipid) where the 
selected sugar of a sugar nucleotide donor is transferred by the galactosyltransferase. 

The term "alkyl", alone or in combination, refers to a branched or linear hydrocarbon radical, typically
containing from 1 through 20 carbon atoms, preferably 1 through 10 carbon atoms, more preferably 1 to 6 carbon
atoms. Typical alkyl groups include but are not limited to methyl, ethyl, 1-propyl, 2-propyl, 1-butyl, 2-butyl,
tert-butyl, pentyl, hexyl, heptyl, octyl, nonyl, decyl, and the like.

The term "alkenyl", alone or in combination, refers to an unsaturated branched or linear group typically
having from 2 to 20 carbon atoms and at least one double bond. Examples of such groups include but are not limited
to ethenyl, 1-propenyl, 2-propenyl, 1-butenyl, 1,3-butadienyl, 1-hexenyl, 2-hexenyl, 1-pentenyl, 2-pentenyl, and the
like.

The term "alkynyl", alone or in combination, refers to an unsaturated branched or linear group having 2 to
20 carbon atoms and at least one triple bond. Examples of such groups include but are not limited to ethynyl,
1-propynyl, 2-propynyl, 1-butynyl, 2-butynyl, 1-pentynyl, and the like.

The term "cycloalkyl" refers to cyclic hydrocarbon groups and includes but is not limited to cyclopropyl,
cyclobutyl, cyclopentyl, cyclohexyl, cycloheptyl, and cyclooctyl.

The term "aryl", alone or in combination, refers to a monocyclic or polycyclic group, preferably a 
monocyclic or bicyclic group. An aryl group may optionally be substituted as described herein. Examples of aryl 
groups and substituted aryl groups are phenyl, benzyl, p-nitrobenzyl, p-methoxybenzyl, biphenyl, and naphthyl.

The term "alkoxy", alone or in combination, refers to an alkyl or cycloalkyl linked to the parent molecular
moiety through an oxygen atom. The term "aryloxy" refers to an aryl linked to the parent molecular moiety through
an oxygen atom. Examples of alkoxy groups are methoxy, ethoxy, propoxy, vinyloxy, allyloxy, butoxy, pentoxy,
hexoxy, cyclopentoxy, and cyclohexoxy. Examples of aryloxy groups are phenyloxy, O-benzyl i.e. benzyloxy,
O-p-nitrobenzyl and O-p-methyl-benzyl, 4-nitrophenyloxy, 4-chlorophenyloxy, and the like.

The term "halo" or "halogen", alone or in combination, means fluoro, chloro, bromo, or iodo. 
The term "amino", alone or in combination, refers to a chemical functional group where a nitrogen atom (N)
is bonded to three substituents being any combination of hydrogen, alkyl, cycloalkyl, alkenyl, alkynyl, or aryl with
the general chemical formula -NR14R16 where R14 and R16 can be any combination of hydrogen, alkyl, cycloalkyl,
alkenyl, alkynyl, or aryl. Optionally one substituent on the nitrogen atom can be a hydroxyl group (-OH) to give an
amine known as a hydroxylamine. Examples of amino groups are amino (-NH2), methylamine, ethylamine,
dimethylamine, 2-propylamine, butylamine, isobutylamine, cyclopropylamine, benzylamine, allylamine,
hydroxylamine, cyclohexylamino (-NHCH(CH2)5), piperidine (-N(CH2)5) and benzylamino (-NHCH2C6H5).

The term "thioalkyl", alone or in combination, refers to a chemical functional group where a sulfur atom (S) 
is bonded to an alkyl. Examples of thioalkyl groups are thiomethyl, thioethyl, and thiopropyl. 

The term "thioaryl", alone or in combination, refers to a chemical functional group where a sulfur atom (S)
is bonded to an aryl group with the general chemical formula -SR16 where R16 is an aryl group which may be
substituted. Examples of thioaryl groups and substituted thioaryl groups are thiophenyl, para-chlorothiophenyl,
thiobenzyl, 4-methoxy-thiophenyl, 4-nitro-thiophenyl, and para-nitrothiobenzyl.

Heterocyclic rings are molecular rings where one or more carbon atoms have been replaced by hetero atoms
(atoms not being carbon) such as, for example, oxygen (O), nitrogen (N) or sulfur (S), or combinations thereof.
Examples of heterocyclic rings include ethylene oxide, tetrahydrofuran, thiophene, piperidine (piperidinyl group),
pyridine (pyridinyl group), and caprolactam. A carbocyclic or heterocyclic group may be optionally substituted at
carbon or nitrogen atoms with, for example, alkyl, phenyl, benzyl or thienyl, or a carbon atom in the heterocyclic
group together with an oxygen atom may form a carbonyl group, or a heterocyclic group may be fused with a phenyl
group.

Three Dimensional Structure of Galactosyltransferases and Ligand Binding Domains of Same

The present invention provides a galactosyltransferase secondary, tertiary and/or quaternary structure. The
invention also provides a homology model that represents the secondary, tertiary, and/or quaternary structure of a
galactosyltransferase. A model may, for example, be a structural model (or representation thereof), or a computer
model. The model itself may be in two or three dimensions. It is possible for a computer model to be in three
dimensions despite the constraints imposed by a conventional computer screen, if it is possible to scroll along at least
a pair of axes, causing "rotation" of the image.

In accordance with an aspect of the invention a method is provided for designing a homology model of a
ligand binding domain of a galactosyltransferase wherein the homology model may be displayed as a three-
dimensional image, the method comprising:
(i) providing an amino acid sequence and structural coordinates of a ligand binding domain structure
of a glycosyltransferase, preferably SpsA glycosyltransferase;
(ii) modifying said structure to take into account differences between the amino acid configuration of
the ligand binding domains of the galactosyltransferase on the one hand and the SpsA
glycosyltransferase on the other hand to generate a homology model; and
(iii) if required, refining the homology model.

The method may further comprise comparing the homology model with the structures of other, similar
proteins.

A model or structure of a preferred galactosyltransferase of the invention has the atomic structural
coordinates as shown in Table 4 or Table 8. Computer representations of the structure, i.e. models, are illustrated in
the Figures.

The structural coordinates in a structure or model of the invention may comprise the amino acid residues of
a galactosyltransferase ligand binding domain, or a portion or homolog thereof useful in the modeling and design of
test compounds capable of binding to the galactosyltransferase. Therefore, the invention also relates to a secondary
and three dimensional structure or model of a ligand binding domain of a galactosyltransferase. Ligand binding
domains include the ligand binding domains for a diphosphate group of a sugar nucleotide donor, a nucleotide of a
sugar nucleotide donor, a nitrogenous heterocyclic base (preferably a pyrimidine base, more preferably uracil) of a
sugar nucleotide donor, and/or a sugar (e.g. galactose) of a sugar nucleotide donor. The structure of a ligand binding
domain may be defined by selected atomic interactions or contacts in the domain, preferably two or more of the
atomic interactions or contacts as defined in Table 1.
It is understood that a structure or model of the invention includes a structure where at least one amino acid
residue is replaced with a different amino acid residue, or where amino acid residues are added or deleted, and which
has substantially the same three dimensional structure as the galactosyltransferase as described in Table 4 and the
Figures, or the ligand binding domains as described in Table 1 (and further defined by the structural coordinates of
the ATOMS in Table 4 or Table 8), i.e. having a set of atomic structural coordinates that have a root mean square
deviation of less than or equal to about 2 Å, preferably less than 0.5 Å, most preferably less than 0.3 Å, when
superimposed with the atomic structure coordinates of the galactosyltransferase as described in Table 4 or Table 8,
or the binding domains as described in Table 1 (and further defined by the structural coordinates of the ATOMS in
Table 4) when at least 50% to 100% of the atoms of the sugar nucleotide donor binding domain or binding domains
of components of the donor, as the case may be, are included in the superimposition.

The invention also features a secondary and three dimensional structure or model of a galactosyltransferase
in association with one or more molecules (e.g. substrates such as UDP-Gal, uridine-ribose, monophosphate-Mn2+, or
diphosphate-Mn2+). The association may be covalent or non-covalent. The molecule may be any organic molecule,
and it may modulate the function of a galactosyltransferase by, for example, inhibiting or enhancing its function, or it
may be an acceptor or donor for the galactosyltransferase. It is preferred that the geometry of the compound and the
interactions formed between the compound and the galactosyltransferase provide high affinity binding between the
two molecules.

The structure and model of the galactosyltransferase described herein has allowed the identification and
characterization of the binding domain of UDP and UDP-Gal. The UDP-Gal binding domain has been subdivided
into three sub-sites (the uracil-binding domain, the ribose-binding domain, the diphosphate-Mn2+ binding domain,
and the Gal binding domain) and characterized.

Therefore, in an embodiment of the invention, a secondary and three dimensional structure or model of a
ligand binding domain of a galactosyltransferase that binds a diphosphate of a sugar nucleotide donor is provided
comprising at least two of atomic interactions 9, 10, and 11 of Table 1, each atomic interaction defined therein by an
atomic contact (more preferably, a specific atom where indicated) on the diphosphate, and an atomic contact (more
preferably, a specific amino acid residue where indicated) on the galactosyltransferase (i.e. enzyme atomic contact).
In a preferred embodiment, the ligand binding domain comprises atomic interactions 9 and 10; 10 and 11; 9 and 11;
or 9, 10, and 11 of Table 1. Preferably, the binding domain is defined by the atoms of the enzyme atomic
interactions having the structural coordinates for the atoms listed in Table 4 or Table 8. Therefore, in an embodiment
of the invention the binding domain is defined by the structural coordinates referred to as ATOM 1690 and ATOM
1718 of Table 8, most preferably ATOM 1690 to ATOM 1718 inclusive of Table 8. The binding domain of a
galactosyltransferase for a diphosphate of a sugar nucleotide donor is also characterized by a DVD motif (Asp-225,
Val-226, and Asp-227).

In another embodiment of the invention, a secondary or three dimensional structure or model of a ligand
binding domain of a galactosyltransferase that binds a heterocyclic amine base of a sugar nucleotide donor is
provided comprising at least two, preferably three, of atomic interactions 1, 2, 3, and 4 of Table 1, each atomic
interaction defined therein by an atomic contact (more preferably, a specific atom where indicated) on the
heterocyclic amine base, and an atomic contact (more preferably, a specific amino acid residue where indicated) on
the galactosyltransferase (i.e. enzyme atomic contact). In a preferred embodiment, the ligand binding domain
comprises atomic interactions 1 and 2; 1 and 3; 1 and 4; 2 and 3; 2 and 4; 3 and 4; or 1, 2, and 3; 2, 3, and 4; 1, 3,
and 4; 1, 2, and 4; or 1, 2, 3, and 4 of Table 1. Preferably, the binding domain is defined by the atoms of the enzyme
atomic interactions having the structural coordinates for the atoms listed in Table 4 or Table 8. Therefore, in an
embodiment of the invention the binding domain is defined by the structural coordinates referred to as ATOM 720,
ATOM 1360, ATOM 1490, and ATOM 154 to ATOM 155 in Table 8. The ligand binding domain of a
galactosyltransferase for a heterocyclic amine base of a sugar nucleotide donor is also characterised by two helices
and two β sheets in anti-parallel fashion. A ligand binding domain for uracil can also be characterized by the
following three hydrogen bonds: (1) the amide hydrogen of uracil in position 3 and OD1 of Asp-168, (2) the
carbonyl oxygen of uracil in position 4 and the side chain of Lys-204, and (3) the carbonyl oxygen of uracil in
position 2 and the amide hydrogen of the His-213 side chain.

In another embodiment of the invention, a secondary and three dimensional structure or model of a ligand
binding domain of a galactosyltransferase that binds the sugar of the nucleotide (e.g. ribose) of a sugar nucleotide
donor is provided comprising at least two, preferably three, of atomic interactions 5, 6, 7, and 8 of Table 1, each
atomic interaction defined therein by an atomic contact (more preferably, a specific atom where indicated) on the
sugar, and an atomic contact (more preferably, a specific amino acid residue where indicated) on the
galactosyltransferase (i.e. enzyme atomic contact). In a preferred embodiment, the binding domain comprises atomic
interactions 5 and 6; 5 and 7; 5 and 8; 6 and 7; 6 and 8; 7 and 8; 5, 6, and 7; 5, 6, and 8; 6, 7, and 8; 5, 7, and 8; or
5, 6, 7, and 8 of Table 1. Preferably, the ligand binding domain is defined by the atoms of the enzyme atomic
interactions having the structural coordinates for the atoms listed in Table 4 or Table 8. Therefore, in an embodiment
of the invention the binding domain is defined by the structural coordinates referred to as ATOM 1690, ATOM 97 to
ATOM 115, and ATOM 1436 to ATOM 1454 of Table 8.

Atomic interactions 1 through 11 in Table 1 are preferably each characterized by the types of binding and/or
the distances between atomic contacts indicated in Table 1.
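
The embodiments above list, for each sub-site, every combination of at least two of its atomic interactions. A minimal sketch of that enumeration follows; the dictionary keys and the function name are labels assumed for this example, and only the interaction numbers come from the text.

```python
from itertools import combinations

# Interaction numbers per sub-site as stated in the text; the keys are illustrative labels.
SUBSITE_INTERACTIONS = {
    "heterocyclic amine base": (1, 2, 3, 4),
    "sugar of the nucleotide (ribose)": (5, 6, 7, 8),
    "diphosphate": (9, 10, 11),
}

def interaction_subsets(interactions, minimum=2):
    """Return every subset of the given interactions containing at least `minimum` members."""
    return [
        combo
        for size in range(minimum, len(interactions) + 1)
        for combo in combinations(interactions, size)
    ]

for subsite, numbers in SUBSITE_INTERACTIONS.items():
    subsets = interaction_subsets(numbers)
    print(f"{subsite}: {len(subsets)} combinations, e.g. {subsets[0]} ... {subsets[-1]}")
```

The counts produced (eleven for a four-interaction sub-site, four for the diphosphate sub-site) match the number of combinations written out in the corresponding embodiments.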

In another embodiment of the invention, a secondary or three dimensional structure of a ligand binding
domain of a galactosyltransferase that binds a nucleotide (preferably UDP) of a sugar nucleotide donor is provided
comprising at least two or more of atomic interactions 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or 11 of Table 1, each atomic
interaction defined therein by an atomic contact (more preferably, a specific atom where indicated) on the
nucleotide, and an atomic contact (more preferably, a specific amino acid residue where indicated) on the
galactosyltransferase (i.e. enzyme atomic contact). In a preferred embodiment, the binding domain comprises
atomic interactions 2, 3, 5, 6, 9, 10, and 11; 4, 7, 8, 9, 10, and 11; 1, 2, 3, 5, 6, 9, 10, 11; or 1 to 11 inclusive of
Table 1. Preferably, the ligand binding domain is defined by the atoms of the enzyme atomic interactions having the
structural coordinates for the atoms listed in Table 4 or Table 8. Therefore, in an embodiment of the invention the
ligand binding domain is defined by the structural coordinates referred to as ATOM 720, ATOM 1360, ATOM 1490,
ATOM 154, ATOM 155, ATOM 1690, ATOM 97 to ATOM 115, ATOM 1436 to ATOM 1454, and ATOM 1718
of Table 8. The binding domain of a galactosyltransferase for a nucleotide of a sugar nucleotide donor is also
characterized by a 100 amino acid nucleotide recognition domain.

A UDP binding domain of a galactosyltransferase is also characterized by an open α,β-sandwich made up
of three helices packed against four β-sheets. The following amino acid residues have also been identified to be part
of the UDP binding domain: Phe-134, Typ-139, Ile-140, Val-136, Arg-194, Arg-202, Lys-209, Asp-173, His-218,
Thr-137, Asp-225, Val-226, and Asp-227.
In yet another embodiment of the invention, a secondary and three dimensional structure or model of a
ligand binding domain of a galactosyltransferase that binds a sugar nucleotide donor (preferably UDP-Gal) is
provided comprising at least three of the atomic interactions of Table 1, each atomic interaction defined therein by
an atomic contact (more preferably, a specific atom where indicated) on the sugar nucleotide donor, and an atomic
contact (more preferably, a specific amino acid residue where indicated) on the galactosyltransferase (i.e. enzyme
atomic contact). In a preferred embodiment, the binding domain comprises atomic interactions 1 to 11 inclusive of
Table 1. Preferably, the ligand binding domain is defined by the atoms of the enzyme atomic interactions having the
structural coordinates for the atoms listed in Table 4 or Table 8. Therefore, in an embodiment of the invention the
ligand binding domain is defined by the structural coordinates referred to as ATOM 720, ATOM 1360, ATOM 1490,
ATOM 154, ATOM 155, ATOM 1690, ATOM 97 to ATOM 115, ATOM 1436 to ATOM 1454, and ATOM 1718
of Table 4.

Identification of Homologues 

The knowledge of the structures and models of the invention enables one skilled in the art to identify
homologues of galactosyltransferases. This is achieved by searches of three-dimensional databases. Since structural
folds are conserved to a greater extent than sequence, one may identify homologues with very little sequence identity
or similarity. Programs that provide this type of database searching are known in the art and include Dali and the Fold
recognition server located at UCLA (8). The structural coordinates of a protein structure are submitted and the
program performs a multiple structural alignment with proteins in the protein data bank. Homologues identified in
accordance with the present invention may be used in the methods of the invention described herein.

Computer Format of Structures/Models

Information derivable from the structures of the present invention (for example the structural coordinates) 
or a model of the present invention may be provided in a computer-readable format. 

Therefore, the invention provides a computer readable medium or a machine readable storage medium
which comprises the models of the invention or structural coordinates of a galactosyltransferase including all or any
parts of the galactosyltransferase (e.g. ligand binding domain), ligands including portions thereof, or substrates
including portions thereof. Such a storage medium encoded with these data is capable of
displaying on a computer screen or similar viewing device a three-dimensional graphical representation of a
molecule or molecular complex which comprises the enzyme or ligand binding domains or similarly shaped
homologous enzymes or ligand binding domains. Thus, the invention also provides computerized representations of
a model or structure of the invention, including any electronic, magnetic, or electromagnetic storage forms of the
data needed to define the structures such that the data will be computer readable for purposes of display and/or
manipulation.

In an aspect the invention provides a computer for producing a model or three-dimensional representation
of a molecule or molecular complex, wherein said molecule or molecular complex comprises a galactosyltransferase
or ligand binding domain thereof defined by structural coordinates of galactosyltransferase amino acids or a ligand
binding domain thereof, or comprises structural coordinates of atoms of a ligand or substrate, or a three-dimensional
representation of a homologue of said molecule or molecular complex, wherein said computer comprises:

(a) a machine-readable data storage medium comprising a data storage material encoded with machine
readable data wherein said data comprises the structural coordinates of galactosyltransferase
amino acids according to Table 4 or Table 8 or a ligand binding domain thereof, or a ligand
according to Table 5, 6, or 7;

(b) a working memory for storing instructions for processing said machine-readable data;

(c) a central-processing unit coupled to said working memory and to said machine-readable data
storage medium for processing said machine readable data into said three-dimensional
representation; and

(d) a display coupled to said central-processing unit for displaying said three-dimensional
representation.

A homologue may comprise a galactosyltransferase or ligand binding domain thereof, or ligand or substrate that
has a root mean square deviation from the backbone atoms of not more than 1.5 angstroms.
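
The backbone criterion for a homologue can be sketched as follows. This is only an illustration: it assumes the coordinates are held as PDB-style fixed-column ATOM records (the layout of Tables 4 and 8 is not confirmed here), that the two structures are already superimposed, and that residues are matched by chain and residue number. The helper names are hypothetical.

```python
import math

BACKBONE_NAMES = {"N", "CA", "C", "O"}

def format_atom(serial, name, res, chain, resseq, x, y, z):
    """Build a PDB-style ATOM record (helper for the demonstration only)."""
    return (f"ATOM  {serial:5d} {name:<4s} {res:>3s} {chain}{resseq:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")

def backbone_atoms(records):
    """Map (chain, residue number, atom name) to (x, y, z) for backbone atoms."""
    atoms = {}
    for line in records:
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        if name in BACKBONE_NAMES:
            key = (line[21], int(line[22:26]), name)
            atoms[key] = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
    return atoms

def backbone_rmsd(records_a, records_b):
    """RMSD (Å) over backbone atoms present in both pre-superimposed structures."""
    a, b = backbone_atoms(records_a), backbone_atoms(records_b)
    shared = sorted(set(a) & set(b))
    if not shared:
        raise ValueError("no backbone atoms in common")
    return math.sqrt(sum(math.dist(a[k], b[k]) ** 2 for k in shared) / len(shared))

def is_homologue(records_a, records_b, cutoff=1.5):
    """Apply the 'not more than 1.5 angstroms' backbone criterion from the text."""
    return backbone_rmsd(records_a, records_b) <= cutoff

# Hypothetical single-residue example; the CB side-chain atom is ignored.
names = ["N", "CA", "C", "O", "CB"]
structure_a = [format_atom(i + 1, n, "ASP", "A", 225, float(i), 0.0, 0.0)
               for i, n in enumerate(names)]
structure_b = [format_atom(i + 1, n, "ASP", "A", 225, float(i), 0.0,
                           10.0 if n == "CB" else 1.0)
               for i, n in enumerate(names)]
print(backbone_rmsd(structure_a, structure_b), is_homologue(structure_a, structure_b))
```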

The invention also provides a computer for determining at least a portion of the structural coordinates
corresponding to an X-ray diffraction pattern of a molecule or molecular complex wherein said computer comprises:

(a) a machine-readable data storage medium comprising a data storage material encoded with machine
readable data wherein said data comprises the structural coordinates according to Table 4, 5, 6, 7,
or 8;

(b) a machine-readable data storage medium comprising a data storage material encoded with machine
readable data wherein said data comprises an X-ray diffraction pattern of said molecule or
molecular complex;

(c) a working memory for storing instructions for processing said machine-readable data of (a) and
(b);

(d) a central-processing unit coupled to said working memory and to said machine-readable data
storage medium of (a) and (b) for performing a Fourier transform of the machine readable data of
(a) and for processing said machine readable data of (b) into structural coordinates; and

(e) a display coupled to said central-processing unit for displaying said structural coordinates of said
molecule or molecular complex.
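
The Fourier transform of structural coordinates referred to in element (d) can be illustrated by a structure factor calculation, F(hkl) = sum over atoms j of f_j exp[2πi(h x_j + k y_j + l z_j)], where (x_j, y_j, z_j) are fractional coordinates. The sketch below is a simplified assumption: it uses unit scattering factors by default and ignores symmetry, temperature factors, and resolution dependence. The function name is illustrative.

```python
import cmath
import math

def structure_factor(frac_coords, hkl, scattering=None):
    """Structure factor F(hkl) for atoms at fractional coordinates.

    `scattering` holds one scattering factor per atom; unit values are
    assumed when it is omitted (a simplification made for this sketch).
    """
    h, k, l = hkl
    if scattering is None:
        scattering = [1.0] * len(frac_coords)
    return sum(
        f * cmath.exp(2j * math.pi * (h * x + k * y + l * z))
        for f, (x, y, z) in zip(scattering, frac_coords)
    )

# Two identical hypothetical atoms half a cell apart along a:
atoms = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.0)]
for hkl in [(1, 0, 0), (2, 0, 0)]:
    amplitude = abs(structure_factor(atoms, hkl))
    print(hkl, round(amplitude, 6))
```

In this example the (1, 0, 0) reflection cancels while the (2, 0, 0) reflection adds constructively. Comparing such calculated amplitudes with observed diffraction data is the general idea behind using known coordinates to phase an unknown structure.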

The invention also contemplates a computer programmed with a homology model of a ligand binding
domain according to the invention; a machine-readable data-storage medium on which has been stored in
machine-readable form a homology model of a ligand binding domain of a galactosyltransferase; and the use of a
homology model as input to a computer programmed for drug design and/or database searching and/or molecular
graphic imaging in order to identify new ligands for galactosyltransferases.

Structural Determinations

The present invention also provides a method for determining the secondary and/or tertiary structures of a
polypeptide by using a model according to the invention. The polypeptide may be any polypeptide for which the
secondary and/or tertiary structure is uncharacterised or incompletely characterised. In a preferred embodiment the
polypeptide shares (or is predicted to share) some structural or functional homology to a galactosyltransferase,
preferably a β1,3 galactosyltransferase. For example, the polypeptide may show a degree of structural homology
over some or all parts of the primary amino acid sequence. For example the polypeptide may have one or more
domains which show homology with a galactosyltransferase domain (Kapitonov and Yu (1999) Glycobiology 9(10):
961-978).
The polypeptide may be a galactosyltransferase with a different specificity for a ligand or substrate. The
polypeptide may be a galactosyltransferase which requires a different metal cofactor. Alternatively (or in addition)
the polypeptide may be a galactosyltransferase from a different species.

The polypeptide may be a mutant of the wild-type galactosyltransferase. A mutant may arise naturally, or
may be made artificially (for example using molecular biology techniques). The mutant may also not be "made" at
all in the conventional sense, but merely tested theoretically using the model of the present invention. A mutant may
or may not be functional.

Thus, using a model of the present invention, the effect of a particular mutation on the overall two and/or
three dimensional structure of a galactosyltransferase and/or the interaction between the enzyme and a ligand or
substrate can be investigated. Alternatively, the polypeptide may perform an analogous function or be suspected to
show a similar catalytic mechanism to the galactosyltransferase enzyme. For example the polypeptide may transfer
a sugar residue from a sugar nucleotide donor.

The polypeptide may also be the same as the polypeptide described herein, but in association with a
different ligand (for example, modulator or inhibitor) or cofactor. In this way it is possible to investigate the effect of
altering a ligand or compound with which the polypeptide is associated on the structure of a ligand binding domain.

Secondary or tertiary structure may be determined by applying the structural coordinates of the model of the 
present invention to other data such as an amino acid sequence, X-ray crystallographic diffraction data, or nuclear 
magnetic resonance (NMR) data. Homology modeling, molecular replacement, and nuclear magnetic resonance 
methods using these other data sets are described below. 

Homology modeling (also known as comparative modeling or knowledge-based modeling) methods
develop a three dimensional model from a polypeptide sequence based on the structures of known proteins (e.g.
native or mutated galactosyltransferases). In the present invention the method utilizes a computer representation of
the structure of a galactosyltransferase, or a binding domain or complex of same as described herein, a computer
representation of the amino acid sequence of a polypeptide with an unknown structure (additional native or mutated
galactosyltransferases), and standard computer representations of the structures of amino acids. The method in
particular comprises the steps of: (a) identifying structurally conserved and variable regions in the known structure;
(b) aligning the amino acid sequences of the known structure and unknown structure; (c) generating coordinates of
main chain atoms and side chain atoms in structurally conserved and variable regions of the unknown structure
based on the coordinates of the known structure thereby obtaining a homology model; and (d) refining the homology
model to obtain a three dimensional structure for the unknown structure. This method is well known to those skilled
in the art (Greer, 1985, Science 228, 1055; Blundell et al., 1988, Eur. J. Biochem. 172, 513; Knighton et al., 1992,
Science 258:130-135, http://biochem.vt.edu/courses/modeling/homology.htn). Computer programs that can be used
in homology modeling are Quanta and the Homology module in the Insight II modeling package distributed by
Molecular Simulations Inc, or MODELLER (Rockefeller University,
www.iucr.ac.uk/sinris-top/logical/prg-modeller.html).

In step (a) of the homology modeling method, a known galactosyltransferase structure is examined to 
identify the structurally conserved regions (SCRs) from which an average structure, or framework, can be 
constructed for these regions of the protein. Variable regions (VRs), in which known structures may differ in 
conformation, also must be identified. SCRs generally correspond to the elements of secondary structure, such as 
alpha-helices and beta-sheets, and to ligand- and substrate-binding sites (e.g. acceptor and donor binding sites). The
VRs usually lie on the surface of the proteins and form the loops where the main chain turns. 

Many methods are available for sequence alignment of known structures and unknown structures. Sequence
alignments generally are based on the dynamic programming algorithm of Needleman and Wunsch [J. Mol. Biol. 48:
442-453, 1970]. Current methods include FASTA, Smith-Waterman, and BLASTP, with the BLASTP method
differing from the other two in not allowing gaps. Scoring of alignments typically involves construction of a 20x20
matrix in which identical amino acids and those of similar character (i.e., conservative substitutions) may be scored
higher than those of different character. Substitution schemes which may be used to score alignments include the
scoring matrices PAM (Dayhoff et al., Meth. Enzymol. 91: 524-545, 1983), and BLOSUM (Henikoff and Henikoff,
Proc. Nat. Acad. Sci. USA 89: 10915-10919, 1992), and the matrices based on alignments derived from three-
dimensional structures including that of Johnson and Overington (JO matrices) (J. Mol. Biol. 233: 716-738, 1993).
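
The dynamic programming approach mentioned above can be sketched as a small global alignment routine in the style of Needleman and Wunsch. This is a minimal illustration only: it assumes a simple identity score (match, mismatch, linear gap penalty) in place of a PAM, BLOSUM, or JO matrix, and the function name and default values are choices made for this example.

```python
def needleman_wunsch(seq_a, seq_b, match=1, mismatch=-1, gap=-1):
    """Global alignment by dynamic programming.

    Returns (score, aligned_a, aligned_b). Identity scoring is an
    assumption of this sketch; a 20x20 substitution matrix lookup
    would replace the match/mismatch test in practice.
    """
    n, m = len(seq_a), len(seq_b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + sub,  # align the two residues
                score[i - 1][j] + gap,      # gap in seq_b
                score[i][j - 1] + gap,      # gap in seq_a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    aligned_a, aligned_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            sub = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + sub:
                aligned_a.append(seq_a[i - 1])
                aligned_b.append(seq_b[j - 1])
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_a.append(seq_a[i - 1])
            aligned_b.append("-")
            i -= 1
        else:
            aligned_a.append("-")
            aligned_b.append(seq_b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(aligned_a)), "".join(reversed(aligned_b))

# Hypothetical short peptide fragments containing a DVD motif.
total, top, bottom = needleman_wunsch("FDVDM", "FDVGDM")
print(total)
print(top)
print(bottom)
```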

Alignment based solely on sequence may be used, though other structural features also may be taken into
account. In Quanta, multiple sequence alignment algorithms are available that may be used when aligning a
sequence of the unknown with the known structures. Four scoring systems (i.e. sequence homology, secondary
structure homology, residue accessibility homology, CA-CA distance homology) are available, each of which may
be evaluated during an alignment so that relative statistical weights may be assigned.

When generating coordinates for the unknown structure, main chain atoms and side chain atoms, both in
SCRs and VRs, need to be modeled. A variety of approaches may be used to assign coordinates to the unknown. In
particular, the coordinates of the main chain atoms of SCRs will be transferred to the unknown structure. VRs
correspond most often to the loops on the surface of the polypeptide and if a loop in the known structure is a good
model for the unknown, then the main chain coordinates of the known structure may be copied. Side chain
coordinates of SCRs and VRs are copied if the residue type in the unknown is identical to or very similar to that in
the known structure. For other side chain coordinates, a side chain rotamer library may be used to define the side
chain coordinates. When a good model for a loop cannot be found, fragment databases may be searched for loops in
other proteins that may provide a suitable model for the unknown. If desired, the loop may then be subjected to
conformational searching to identify low energy conformers.

Once a homology model has been generated it is analyzed to determine its correctness. A computer 
program available to assist in this analysis is the Protein Health module in Quanta which provides a variety of tests. 
Other programs that provide structure analysis along with output include PROCHECK and 3D-Profiler [Luthy R. et 
al., Nature 356: 83-85, 1992; and Bowie, J.U. et al., Science 253: 164-170, 1991]. Once any irregularities have been 
resolved, the entire structure may be further refined. Refinement may consist of energy minimization with restraints, 
especially for the SCRs. Restraints may be gradually removed for subsequent minimizations. Molecular dynamics 
may also be applied in conjunction with energy minimization. 
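
The idea of minimizing with restraints that are gradually removed can be sketched with a deliberately small example. The code below assumes a one-dimensional chain of atoms, a harmonic "bond" term, harmonic positional restraints to the starting (template) coordinates, and plain steepest descent; none of this is a real force field, and all names and constants are hypothetical.

```python
def energy_and_gradient(x, ref, k_restraint, bond_length=1.5, k_bond=1.0):
    """Toy energy: harmonic bonds between neighbours plus positional restraints."""
    grad = [0.0] * len(x)
    energy = 0.0
    for i in range(len(x) - 1):
        delta = (x[i + 1] - x[i]) - bond_length
        energy += k_bond * delta * delta
        grad[i] -= 2.0 * k_bond * delta
        grad[i + 1] += 2.0 * k_bond * delta
    for i in range(len(x)):
        deviation = x[i] - ref[i]
        energy += k_restraint * deviation * deviation
        grad[i] += 2.0 * k_restraint * deviation
    return energy, grad


def minimize(x, ref, k_restraint, steps=500, step_size=0.02):
    """Steepest descent with a fixed step size."""
    x = list(x)
    for _ in range(steps):
        _, grad = energy_and_gradient(x, ref, k_restraint)
        x = [xi - step_size * gi for xi, gi in zip(x, grad)]
    return x


def refine(template, schedule=(10.0, 1.0, 0.0)):
    """Minimize repeatedly while relaxing the restraint force constant."""
    x = list(template)
    for k_restraint in schedule:
        x = minimize(x, template, k_restraint)
    return x
```

With strong restraints the coordinates stay close to the template; as the force constant falls to zero the model is free to relax to the ideal geometry of the toy energy function.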

The structural coordinates of a galactosyltransferase structure may be applied to nuclear magnetic resonance 
(NMR) data to determine the three dimensional structures of polypeptides in solution (e.g. additional native or 
mutated galactosyltransferases). (See for example, Wuthrich, 1986, John Wiley and Sons, New York: 176-199; 
Pflugrath et al., 1986, J. Molecular Biology 189: 383-386; Kline et al., 1986 J. Molecular Biology 189:377-382). 
While the secondary structure of a polypeptide may often be determined by NMR data, the spatial connections 
between individual pieces of secondary structure are not as readily determined. The structural coordinates of a 
polypeptide can guide the NMR spectroscopist to an understanding of the spatial interactions between secondary 
structural elements in a polypeptide of related structure. Information on spatial interactions between secondary 
structural elements can greatly simplify Nuclear Overhauser Effect (NOE) data from two-dimensional NMR 
experiments. In addition, applying the structural coordinates after the determination of secondary structure by NMR 
techniques simplifies the assignment of NOEs relating to particular amino acids in the polypeptide sequence and 
does not greatly bias the NMR analysis of polypeptide structure. 

In an embodiment, the invention relates to a method of determining three dimensional structures of 
polypeptides with unknown structures, preferably native or mutated galactosyltransferases, by applying the 
structural coordinates of a galactosyltransferase structure, or ligand binding domain or complex thereof described 
herein to nuclear magnetic resonance (NMR) data of the unknown structure. This method comprises the steps of: (a) 
determining the secondary structure of an unknown structure using NMR data; and (b) simplifying the assignment of 
through-space interactions of amino acids. The term "through-space interactions" defines the orientation of the 
secondary structural elements in the three dimensional structure and the distances between amino acids from 
different portions of the amino acid sequence. The term "assignment" defines a method of analyzing NMR data and 
identifying which amino acids give rise to signals in the NMR spectrum. 

Screening Method 

The present invention provides a method of screening for a ligand that associates with a ligand binding 
domain and/or modulates the function of a galactosyltransferase, by using a structure or a model according to the 
present invention. The method may involve investigating whether a test compound is capable of associating with or 
binding a ligand binding domain. 
In accordance with an aspect of the present invention, a method is provided for screening for a ligand 
capable of associating with or binding to a ligand binding domain, wherein said method comprises the use of a 
structure or model according to the invention. 

In another aspect, the invention relates to a method of screening for a ligand capable of associating with or 
binding to a ligand binding domain, wherein the ligand binding domain is defined by the amino acid residue 
structural coordinates given herein, the method comprising contacting the ligand binding domain with a test 
compound and determining if said test compound binds to said ligand binding domain. 

In one embodiment, the present invention provides a method of screening for a test compound capable of 
interacting with a key amino acid residue of a ligand binding domain of a galactosyltransferase. 
Another aspect of the invention provides a process comprising the steps of: 
(a) performing the method of screening for a ligand as described above; 
(b) identifying one or more ligands capable of binding to a ligand binding domain; and 
(c) preparing a quantity of said one or more ligands. 

A further aspect of the invention provides a process comprising the steps of: 
(a) performing the method of screening for a ligand as described above; 
(b) identifying one or more ligands capable of binding to a ligand binding domain; and 
(c) preparing a pharmaceutical composition comprising said one or more ligands. 
Once a test compound capable of interacting with a key amino acid residue in a galactosyltransferase ligand 
binding domain has been identified, further steps may be carried out either to select and/or to modify compounds 
and/or to modify existing compounds, to modulate the interaction with the key amino acid residues in the 
galactosyltransferase ligand binding domain. 
Yet another aspect of the invention provides a process comprising the steps of: 
(a) performing the method of screening for a ligand as described above; 
(b) identifying one or more ligands capable of binding to a ligand binding domain; 
(c) modifying said one or more ligands capable of binding to a ligand binding domain; 
(d) performing said method of screening for a ligand as described above; 
(e) optionally preparing a pharmaceutical composition comprising said one or more ligands. 
As used herein, the term "test compound" means any compound which is potentially capable of associating 
with a ligand binding domain. If, after testing, it is determined that the test compound does associate with or bind to 
the ligand binding domain, it is known as a "ligand". 
A "test compound" includes, but is not limited to, a compound which may be obtainable from or produced 
by any suitable source, whether natural or not. The test compound may be designed or obtained from a library of 
compounds which may comprise peptides, as well as other compounds, such as small organic molecules and 
particularly new lead compounds. By way of example, the test compound may be a natural substance, a biological 
macromolecule, or an extract made from biological materials such as bacteria, fungi, or animal (particularly 
mammalian) cells or tissues, an organic or an inorganic molecule, a synthetic test compound, a semi-synthetic test 
compound, a carbohydrate, a monosaccharide, an oligosaccharide or polysaccharide, a glycolipid, a glycopeptide, a 
saponin, a heterocyclic compound, a structural or functional mimetic, a peptide, a peptidomimetic, a derivatised test 
compound, a peptide cleaved from a whole protein, or a peptide synthesised synthetically (such as, by way of 
example, either using a peptide synthesizer or by recombinant techniques or combinations thereof), a recombinant 
test compound, a natural or a non-natural test compound, a fusion protein or equivalent thereof and mutants, 
derivatives or combinations thereof. 

The test compound may be screened as part of a library or a data base of molecules. Data bases which may 
be used include ACD (Molecular Designs Limited), NCI (National Cancer Institute), CCDC (Cambridge 
Crystallographic Data Center), CAST (Chemical Abstract Service), Derwent (Derwent Information Limited), 
Maybridge (Maybridge Chemical Company Ltd), Aldrich (Aldrich Chemical Company), DOCK (University of 
California in San Francisco), and the Directory of Natural Products (Chapman & Hall). Computer programs such as 
CONCORD (Tripos Associates) or DB-Converter (Molecular Simulations Limited) can be used to convert a data set 
represented in two dimensions to one represented in three dimensions. 

Test compounds may be tested for their capacity to fit spatially into a galactosyltransferase ligand binding 
domain. As used herein, the term "fits spatially" means that the three-dimensional structure of the test compound is 
accommodated geometrically in a galactosyltransferase ligand binding domain. The test compound can then be 
considered to be a ligand. 

A favourable geometric fit occurs when the surface area of the test compound is in close proximity with the 
surface area of the cavity or pocket without forming unfavorable interactions or associations. A favourable 
complementary interaction occurs where the test compound interacts by hydrophobic, aromatic, ionic, dipolar, or 
hydrogen donating and accepting forces. Unfavourable interactions or associations may be steric hindrance between 
atoms in the test compound and atoms in the binding site. 
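
A crude numerical reading of "geometric fit" is to count close contacts and steric clashes between the atoms of a test compound and the atoms lining the pocket. The sketch below assumes atoms are given as (x, y, z) tuples in Angstroms; the 2.5 and 4.5 Angstrom cutoffs and the function name are illustrative assumptions, not values taken from this description.

```python
import math


def assess_fit(ligand_atoms, pocket_atoms, clash_cutoff=2.5, contact_cutoff=4.5):
    """Count clashes (too close) and contacts (close proximity) between atom sets."""
    clashes = 0
    contacts = 0
    for ligand_atom in ligand_atoms:
        for pocket_atom in pocket_atoms:
            distance = math.dist(ligand_atom, pocket_atom)
            if distance < clash_cutoff:
                clashes += 1
            elif distance <= contact_cutoff:
                contacts += 1
    # A pose "fits" here if it touches the pocket without any steric clash.
    return {"contacts": contacts, "clashes": clashes,
            "fits": clashes == 0 and contacts > 0}
```

A fuller treatment would also score the complementary interactions listed above (hydrophobic, aromatic, ionic, dipolar, hydrogen bonding), which this distance-only check ignores.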

In an embodiment of the invention, a method is provided for identifying potential modulators of 
galactosyltransferase function. The method utilizes the structural coordinates or model of a galactosyltransferase 
three dimensional structure, or binding domain thereof. The method comprises the steps of (a) docking a computer 
representation of a test compound from a computer data base with a computer model of a ligand binding domain of a 
galactosyltransferase; (b) determining a conformation of a complex between the test compound and binding domain 
with a favourable geometric fit or favorable complementary interactions; and (c) identifying test compounds that 
best fit the galactosyltransferase binding domain as potential modulators of galactosyltransferase function. The initial 
galactosyltransferase structure may or may not have substrates bound to it. A favourable complementary interaction 
occurs where a compound in a compound-galactosyltransferase complex interacts by hydrophobic, ionic, or 
hydrogen donating and accepting forces, with the active-site or ligand binding domain of a galactosyltransferase 
without forming unfavorable interactions. 

If a model of the present invention is a computer model, the test compounds may be positioned in a ligand 
binding domain through computational docking. If, on the other hand, the model of the present invention is a 
structural model, the test compounds may be positioned in the ligand binding domain by, for example, manual 
docking. 

As used herein the term "docking" refers to a process of placing a compound in close proximity with a 
galactosyltransferase ligand binding domain, or a process of finding low energy conformations of a test compound/ 
galactosyltransferase complex. 

A screening method of the present invention may comprise the following steps: 

(i) generating a computer model of a galactosyltransferase or a ligand binding domain thereof 
according to the first aspect of the invention; 

(ii) docking a computer representation of a test compound with the computer model; 

(iii) analysing the fit of the compound in the galactosyltransferase or ligand binding domain thereof. 

In an aspect of the invention a method is provided comprising the following steps: 

(a) docking a computer representation of a structure of a test compound into a computer representation 
of a ligand binding domain of a galactosyltransferase defined in accordance with the invention 
using a computer program, or by interactively moving the representation of the test compound into 
the representation of the binding domain; 

(b) characterizing the geometry and the complementary interactions formed between the atoms of the 
ligand binding domain and the test compound; optionally 

(c) searching libraries for molecular fragments which can fit into the empty space between the 
compound and ligand binding domain and can be linked to the compound; and 

(d) linking the fragments found in (c) to the compound and evaluating the new modified compound. 

In an embodiment of the invention a method is provided which comprises the following steps: 
(a) docking a computer representation of a test compound from a computer data base with a computer 
representation of a selected site (e.g. an inhibitor binding domain) on a galactosyltransferase 
structure or model defined in accordance with the invention to obtain a complex; 
(b) determining a conformation of the complex with a favourable geometric fit and favourable 
complementary interactions; and 
(c) identifying test compounds that best fit the selected site as potential modulators of the 
galactosyltransferase. 
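
Steps (a) to (c) can be sketched as a toy rigid-body search followed by a ranking. The example assumes that each test compound and the selected site are lists of (x, y, z) atom positions in Angstroms, that poses are generated only by trial translations on a coarse grid, and that a pose is scored by +1 per close contact and -10 per steric clash. These choices and all names are hypothetical; real docking programs also sample rotations and conformations and use far richer scoring functions.

```python
import math
from itertools import product


def pose_score(ligand, site, clash_cutoff=2.5, contact_cutoff=4.5):
    """Reward close contacts and penalize clashes (illustrative scoring only)."""
    score = 0.0
    for a in ligand:
        for b in site:
            distance = math.dist(a, b)
            if distance < clash_cutoff:
                score -= 10.0
            elif distance <= contact_cutoff:
                score += 1.0
    return score


def dock(ligand, site, shifts=(-1.0, 0.0, 1.0)):
    """Steps (a) and (b): try grid translations and keep the best-scoring pose."""
    best_score, best_pose = None, None
    for dx, dy, dz in product(shifts, repeat=3):
        pose = [(x + dx, y + dy, z + dz) for x, y, z in ligand]
        score = pose_score(pose, site)
        if best_score is None or score > best_score:
            best_score, best_pose = score, pose
    return best_score, best_pose


def screen(library, site):
    """Step (c): rank compounds in a data base by their best docking score."""
    results = [(name, dock(atoms, site)[0]) for name, atoms in library.items()]
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Compounds at the top of the ranked list returned by screen would be the candidates carried forward as potential modulators.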

A method of the invention may be applied to a plurality of test compounds, to identify those that best fit the 
selected site. 

The model used in the screening method may comprise a galactosyltransferase or ligand binding domain 
thereof either alone or in association with one or more ligands and/or cofactors. For example, the model may 
comprise the ligand-binding domain in association with a ligand, substrate, or analogue thereof. 

If the model comprises an unassociated ligand binding domain, then the selected site under investigation 
may be the ligand binding domain itself. The test compound may, for example, mimic a known substrate for the 
enzyme in order to interact with the ligand binding domain. The selected site may alternatively be another site on 
the enzyme. 

If the model comprises an associated ligand binding domain, for example a ligand binding domain in 
association with a ligand or substrate molecule or analogue thereof, the selected site may be the ligand binding 
domain or a site made up of the ligand binding domain and the complexed ligand, or a site on the ligand itself. The 
test compound may be investigated for its capacity to modulate the interaction with the associated molecule. 

A test compound (or plurality of test compounds) may be selected on the basis of its similarity to a known 
ligand for the galactosyltransferase. For example, the screening method may comprise the following steps: 

(i) generating a computer model of a galactosyltransferase ligand binding domain in complex with a 
ligand; 

(ii) searching for a test compound with a similar three dimensional structure and/or similar chemical 
groups; and 

(iii) evaluating the fit of the test compound in the ligand binding domain. 

Searching may be carried out using a database of computer representations of potential compounds, using 
methods known in the art. 
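
One commonly used measure for the "similar chemical groups" search of step (ii) is the Tanimoto coefficient between sets of structural features. The sketch below is an assumption about how such a search could look; the feature labels, compound names and 0.5 threshold are hypothetical and are not taken from any of the data bases named above.

```python
def tanimoto(features_a, features_b):
    """Tanimoto coefficient: shared features divided by all distinct features."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def similarity_search(query, database, threshold=0.5):
    """Return (name, similarity) pairs at or above the threshold, best first."""
    hits = [(name, tanimoto(query, features)) for name, features in database.items()]
    hits = [hit for hit in hits if hit[1] >= threshold]
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


# Hypothetical group-level description of a known ligand and three candidates.
known_ligand = {"uracil", "ribose", "diphosphate", "galactose"}
candidates = {
    "analogue_1": {"uracil", "ribose", "diphosphate", "glucose"},
    "analogue_2": {"uracil", "ribose"},
    "unrelated": {"benzene"},
}
```

Compounds returned by similarity_search would then proceed to step (iii), where their fit in the ligand binding domain is evaluated.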

The present invention also provides a method for designing ligands for a galactosyltransferase. It is well 
known in the art to use a screening method as described above to identify a test compound with promising fit, but 
then to use this test compound as a starting point to design a ligand with improved fit to the model. A known 
modulator can also be modified to enhance its fit with a model of the invention. Such techniques are known as 
"structure-based ligand design" (See Kuntz et al., 1994, Acc. Chem. Res. 27:117; Guida, 1994, Current Opinion in 
Struc. Biol. 4: 777; and Colman, 1994, Current Opinion in Struc. Biol. 4: 868, for reviews of structure-based drug 
design and identification; and Kuntz et al., 1982, J. Mol. Biol. 162:269; Kuntz et al., 1994, Acc. Chem. Res. 27: 117; 
Meng et al., 1992, J. Comp. Chem. 13: 505; Bohm, 1994, J. Comp. Aided Molec. Design 8: 623 for methods of 
structure-based modulator design). 

Examples of computer programs that may be used for structure-based ligand design are CAVEAT (Bartlett 
et al., 1989, in "Chemical and Biological Problems in Molecular Recognition", Roberts, S.M.; Ley, S.V.; Campbell, 
N.M. eds; Royal Society of Chemistry: Cambridge, pp 182-196); FLOG (Miller et al., 1994, J. Comp. Aided Molec. 
Design 8:153); PRO Modulator (Clark et al., 1995 J. Comp. Aided Molec. Design 9:13); MCSS (Miranker and 
Karplus, 1991, Proteins: Structure, Function, and Genetics 8:195); and, GRID (Goodford, 1985, J. Med. Chem. 
28:849). 

The method may comprise the following steps: 

(i) docking a model of a test compound with a model of a selected ligand binding domain; 
(ii) identifying one or more groups on the test compound which may be modified to improve their fit 
in the selected ligand binding domain; 
(iii) replacing one or more identified groups to produce a modified test compound model; and 
(iv) docking the modified test compound model with the model of the selected ligand binding domain. 
Evaluation of fit may comprise the following steps: 

(a) mapping chemical features of a test compound such as hydrogen bond donors or acceptors, 
hydrophobic/lipophilic sites, positively ionizable sites, or negatively ionizable sites; and 

(b) adding geometric constraints to selected mapped features. 

The fit of the modified test compound may then be evaluated using the same criteria. 
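
The two evaluation steps above, mapping features and adding geometric constraints, can be sketched as a simple distance-range check between mapped features. The representation is an assumption: each feature is a (kind, (x, y, z)) pair, each constraint is (kind_a, kind_b, minimum distance, maximum distance) in Angstroms, and every constraint is tested independently rather than requiring one consistent assignment of features. All names and values are hypothetical.

```python
import math


def satisfies_constraints(features, constraints):
    """True if every distance constraint is met by some pair of mapped features."""
    for kind_a, kind_b, min_dist, max_dist in constraints:
        found = any(
            min_dist <= math.dist(pos_a, pos_b) <= max_dist
            for i, (feat_a, pos_a) in enumerate(features)
            for j, (feat_b, pos_b) in enumerate(features)
            if i != j and feat_a == kind_a and feat_b == kind_b
        )
        if not found:
            return False
    return True


# Hypothetical mapped features of a test compound.
compound_features = [
    ("donor", (0.0, 0.0, 0.0)),
    ("acceptor", (3.0, 0.0, 0.0)),
    ("hydrophobic", (0.0, 4.0, 0.0)),
]
# Hypothetical geometric constraints derived from the selected site.
site_constraints = [
    ("donor", "acceptor", 2.5, 3.5),
    ("donor", "hydrophobic", 3.5, 4.5),
]
```

The same check can be re-run on a modified test compound so that the original and modified compounds are compared against identical criteria.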

The chemical modification of a group may either enhance or reduce hydrogen bonding interaction, charge 
interaction, hydrophobic interaction, van der Waals interaction or dipole interaction between the test compound and 
the key amino acid residue(s) of the selected site. Preferably the group modifications involve the addition, removal, 
or replacement of substituents onto the test compound such that the substituents are positioned to collide or to bind 
preferentially with one or more amino acid residues that correspond to the key amino acid residues of the selected 
site. 

Identified groups in a test compound may be substituted with, for example, alkyl, alkoxy, hydroxyl, aryl, 
cycloalkyl, alkenyl, alkynyl, thiol, thioalkyl, thioaryl, amino, or halo groups. Generally, initial substitutions are 
conservative, i.e., the replacement group will have approximately the same size, shape, hydrophobicity and charge as 
the original group. It should, of course, be understood that components known in the art to alter conformation should 
be avoided. 

If a modified test compound model has an improved fit, then it may bind to the selected site and be 
considered to be a "ligand". Rational modification of groups may be made with the aid of libraries of molecular 
fragments which may be screened for their capacity to fit into the available space and to interact with the appropriate 
atoms. Databases of computer representations of libraries of chemical groups are available commercially, for this 
purpose. 

A test compound may also be modified "in situ" (i.e. once docked into the potential binding domain), 
enabling immediate evaluation of the effect of replacing selected groups. The computer representation of the test 
compound may be modified by deleting a chemical group or groups, replacing chemical groups, or by adding a 
chemical group or groups. After each modification to a compound, the atoms of the modified compound and 
potential binding site can be shifted in conformation and the distance between the modulator and the active site 
atoms may be scored on the basis of geometric fit and favourable complementary interactions between the 
molecules. This technique is described in detail in Molecular Simulations User Manual, 1995 in LUDI. 

Examples of ligand building and/or searching computer programs include programs in the Molecular 
Simulations Package (Catalyst), ISIS/HOST, ISIS/BASE, and ISIS/DRAW (Molecular Designs Limited), and 
UNITY (Tripos Associates). 

The "starting point" for rational ligand design may be a known ligand for the enzyme. For example, in 
order to identify potential modulators of a galactosyltransferase, a logical approach would be to start with a known 
ligand (for example a substrate molecule or inhibitor) to produce a molecule which mimics the binding of the 
ligand. Such a molecule may, for example, act as a competitive inhibitor for the true ligand, or may bind so strongly 
that the interaction (and inhibition) is effectively irreversible. Such a method may comprise the following steps: 

(i) generating a computer model of a galactosyltransferase ligand binding domain in complex with a 
ligand; 

(ii) replacing one or more groups on the ligand computer model to produce a modified ligand; and 

(iii) evaluating the fit of the modified ligand in the ligand binding domain. 

The replacement groups could be selected and replaced using a compound construction program which 
replaces computer representations of chemical groups with groups from a computer database, where the 
representations of the compounds are defined by structural coordinates. 
In an embodiment, a screening method is provided for identifying a ligand of a galactosyltransferase 
comprising the step of using the structural coordinates or model of a substrate molecule or component thereof, 
defined in relation to its spatial association with a galactosyltransferase structure or a ligand binding domain, to 
generate a compound that is capable of associating with the galactosyltransferase or ligand binding domain. 

The invention contemplates a method for the design of modulators for galactosyltransferases based on the 
three dimensional structure or model of a sugar nucleotide donor (or parts thereof) defined in relation to the three 
dimensional structure of a ligand binding domain. 

In accordance with particular aspects of the invention, a method is provided for designing potential 
inhibitors of a galactosyltransferase comprising the step of using the structural coordinates of uracil, uridine, or UDP 
of Table 5, 6, or 7 to generate a compound for associating with the active site of a ligand binding domain of a 
galactosyltransferase. The following steps are employed in a particular method of the invention: (a) generating a 
computer representation of uracil, uridine, or UDP defined by structural coordinates of Table 5, 6 or 7; (b) searching 
for molecules in a data base that are similar to the defined uracil, uridine, or UDP using a searching computer 
program, or replacing portions of the compound with similar chemical structures from a database using a compound 
building computer program. 

In another embodiment of the invention, a method is provided for designing potential inhibitors of a 
glycosyltransferase comprising the step of using the structural coordinates of UDP-Gal of Table 6, to generate a 
compound for associating with the active site of a galactosyltransferase. The following steps are employed in a 
particular method of the invention: (a) generating a computer representation of UDP-Gal defined by the structural 
coordinates of Table 6; (b) searching for molecules in a data base that are similar to the defined UDP-Gal using a 
searching computer program, or replacing portions of the compound with similar chemical structures from a 
database using a compound building computer program. 

The screening methods of the present invention may be used to identify compounds or entities that associate 
with a molecule that associates with a galactosyltransferase enzyme (for example, a substrate molecule). 

Compounds and entities (e.g. ligands) of a galactosyltransferase identified using the above-described 
methods may be prepared using methods described in standard reference sources utilized by those skilled in the art. 
For example, organic compounds may be prepared by organic synthetic methods described in references such as 
March, 1994, Advanced Organic Chemistry: Reactions, Mechanisms, and Structure, New York, McGraw Hill. (See 
detailed discussion herein.) 

Test compounds and ligands which are identified using a model of the present invention can be screened in 
assays such as those well known in the art. Screening can be, for example, in vitro, in cell culture, and/or in vivo. 
Biological screening assays preferably centre on activity-based response models, binding assays (which measure 
how well a compound binds), and bacterial, yeast and animal cell lines (which measure the biological effect of a 
compound in a cell). The assays can be automated for high capacity-high throughput screening (HTS) in which 
large numbers of compounds can be tested to identify compounds with the desired activity. The biological assay 
may also be an assay for the ligand binding activity of a compound that selectively binds to the ligand binding 
domain compared to other enzymes. 

Ligands/Modulators 

The present invention provides a ligand or compound or entity identified by a screening method of the 
present invention. A ligand or compound may have been designed rationally by using a model according to the 
present invention. A ligand or compound identified using the screening methods of the invention specifically 
associates with a target compound. In the present invention the target compound may be a galactosyltransferase or a 
molecule that is capable of associating with a galactosyltransferase (for example a substrate molecule). In a 
preferred embodiment the ligand is capable of binding to the ligand binding domain of a galactosyltransferase. 

A ligand or compound identified using a screening method of the invention may act as a "modulator", i.e. a 
compound which affects the activity of a galactosyltransferase. A modulator may reduce, enhance or alter the 
biological function of a galactosyltransferase. For example a modulator may modulate the capacity of the enzyme to 
transfer a sugar from a nucleotide sugar donor to a specific hydroxyl of various saccharide acceptors that leads to the 
formation of a new glycosidic linkage. An alteration in biological function may be characterised by a change in 
specificity. For example, a modulator may cause the enzyme to accept a different substrate molecule, to transfer a 
different sugar, or to work with a different metal cofactor. In order to exert its function, the modulator commonly 
binds to the ligand binding domain. 

A modulator which is capable of reducing the biological function of the enzyme may also be known as an 
inhibitor. Preferably an inhibitor reduces or blocks the capacity of the enzyme to form new glycosidic linkages. The 
inhibitor may mimic the binding of a substrate molecule, for example, it may be a substrate analogue. A substrate 
analogue may be designed by considering the interactions between the substrate molecule and the enzyme (for 
example by using information derivable from a model of the invention) and specifically altering one or more groups. 

In a highly preferred embodiment, a modulator acts as an inhibitor of a galactosyltransferase and is capable 
of inhibiting N- or O-glycan biosynthesis. 

The present invention also provides a method for modulating the activity of a galactosyltransferase within a 
cell using a modulator according to the present invention. It would be possible to monitor the expression of 
N-glycans on the cell surface following such treatment by a number of methods known in the art (for example by 
detecting expression with an N- and O-glycan specific antibody). 

In another preferred embodiment, the modulator modulates the catalytic mechanism of the enzyme. 
A modulator may be an agonist, partial agonist, partial inverse agonist or antagonist of a 
galactosyltransferase or a ligand binding domain. 

The term "agonist" includes any ligand, which is capable of binding to a ligand binding domain and which 
is capable of increasing a proportion of active enzyme, resulting in an increased biological response. The term 
includes partial agonists and inverse agonists. 

The term "partial agonist" includes an agonist that is unable to evoke the maximal response of a biological 
system, even at a concentration sufficient to saturate a specific ligand binding domain. 

The term "partial inverse agonist" includes an inverse agonist that evokes a submaximal response to a 
biological system, even at a concentration sufficient to saturate a specific ligand binding domain. At high 
concentrations, it will diminish the actions of a full inverse agonist. 

The invention relates to a galactosyltransferase ligand binding domain antagonist, wherein said ligand 
binding domain is that defined by the amino acid structural coordinates described herein. For example the ligand 
may antagonise the inhibition of galactosyltransferase by an inhibitor. 

The term "antagonist" includes any agent that reduces the action of another agent, such as an agonist. The 
antagonist may act at the same site as the agonist (competitive antagonism). The antagonistic action may result from 
a combination with the substance being antagonised (chemical antagonism) or the production of an opposite effect 
through a different binding site (functional antagonism or physiological antagonism) or as a consequence of 
competition for the binding site of an intermediate that links the enzyme to the effect observed (indirect antagonism). 

The term "competitive antagonism" refers to the competition between an agonist and an antagonist for a 
ligand binding domain that occurs when the binding of agonist and antagonist becomes mutually exclusive. This 
may be because the agonist and antagonist compete for the same binding site or combine with adjacent but 
overlapping sites. A third possibility is that different sites are involved but that they influence the same 
macromolecules in such a way that agonist and antagonist molecules cannot be bound at the same time. If the 
agonist and antagonist form only short lived combinations with the binding site so that equilibrium between agonist, 
antagonist and binding site is reached during the presence of the agonist, the antagonism will be surmountable over a 
wide range of concentrations. In contrast, some antagonists, when in close enough proximity to their binding site, 
may form a stable covalent bond with it and the antagonism becomes insurmountable when no spare receptors 
remain. 
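
The surmountable case can be illustrated with the Gaddum equation for reversible competitive binding at equilibrium, in which the fraction of sites occupied by agonist is [A] / ([A] + K_A (1 + [B] / K_B)). The sketch assumes a single binding site, equilibrium conditions, and concentrations and dissociation constants expressed in the same units; it does not model the covalent, insurmountable case. The function and parameter names are hypothetical.

```python
def agonist_occupancy(agonist, antagonist, k_agonist, k_antagonist):
    """Fraction of sites bound by agonist under reversible competitive binding."""
    return agonist / (agonist + k_agonist * (1.0 + antagonist / k_antagonist))


# With both dissociation constants set to 1 (arbitrary units):
no_antagonist = agonist_occupancy(1.0, 0.0, 1.0, 1.0)      # half of the sites occupied
with_antagonist = agonist_occupancy(1.0, 1.0, 1.0, 1.0)    # occupancy falls to one third
more_agonist = agonist_occupancy(2.0, 1.0, 1.0, 1.0)       # doubling agonist restores one half
```

Because the antagonist only scales the apparent dissociation constant of the agonist, raising the agonist concentration recovers the original occupancy, which is what "surmountable" means in this context.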

As mentioned above, an identified ligand or compound may act as a ligand model (for example, a template) 
for the development of other compounds. A modulator may be a mimetic of a ligand or ligand binding domain. A 
mimetic of a ligand may compete with a natural ligand for a galactosyltransferase and antagonize a physiological 
effect of the enzyme in an animal. A mimetic of a ligand may be an organically synthesized compound. A mimetic 
of a ligand binding domain may be either a peptide, polysaccharide, oligosaccharide, or other biopharmaceutical 
(such as an organically synthesized compound) that specifically binds to a natural substrate molecule for a 
galactosyltransferase and antagonizes a physiological effect of the enzyme in an animal. 

Once a ligand has been optimally selected or designed, substitutions may then be made in some of its atoms 
or side groups in order to improve or modify its binding properties. Generally, initial substitutions are conservative, 
i.e., the replacement group will have approximately the same size, shape, hydrophobicity and charge as the original 
group. It should, of course, be understood that components known in the art to alter conformation should be avoided.
Such substituted chemical compounds may then be analyzed for efficiency of fit to a galactosyltransferase ligand
binding domain by the same computer methods described above.

Preferably, positions for substitution are selected based on the predicted binding orientation of a ligand to a 
galactosyltransferase ligand binding domain. 

A technique suitable for preparing a modulator will depend on its chemical nature. For example, organic
compounds may be prepared by organic synthetic methods described in references such as March, 1994, Advanced
Organic Chemistry: Reactions, Mechanisms, and Structure, New York, McGraw Hill. Peptides can be synthesized
by solid phase techniques (Roberge JY et al (1995) Science 269: 202-204) and automated synthesis may be
achieved, for example, using the ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with the instructions
provided by the manufacturer. Once cleaved from the resin, the peptide may be purified by preparative high
performance liquid chromatography (e.g., Creighton (1983) Proteins: Structures and Molecular Principles, WH
Freeman and Co, New York NY). The composition of the synthetic peptides may be confirmed by amino acid
analysis or sequencing (e.g., the Edman degradation procedure; Creighton, supra).

If a modulator is a nucleotide, or a polypeptide expressible therefrom, it may be synthesized, in whole or in
part, using chemical methods well known in the art (see Caruthers MH et al (1980) Nuc Acids Res Symp Ser 215-23,
Horn T et al (1980) Nuc Acids Res Symp Ser 225-232), or it may be prepared using recombinant techniques well
known in the art.

Direct synthesis of a ligand or mimetics thereof can be performed using various solid-phase techniques 
(Roberge JY et al (1995) Science 269: 202-204) and automated synthesis may be achieved, for example, using the 
ABI 431A Peptide Synthesizer (Perkin Elmer) in accordance with the instructions provided by the manufacturer.
Additionally, the amino acid sequences obtainable from the ligand, or any part thereof, may be altered during direct
synthesis and/or combined using chemical methods with a sequence from other subunits, or any part thereof, to 
produce a variant ligand. 

In an alternative embodiment of the invention, the coding sequence of a ligand or mimetics thereof may be 
synthesized, in whole or in part, using chemical methods well known in the art (see Caruthers MH et al (1980) Nuc 
Acids Res Symp Ser 215-23, Horn T et al (1980) Nuc Acids Res Symp Ser 225-232).

A wide variety of host cells can be employed for expression of the nucleotide sequences encoding a ligand 
of the present invention. These cells may be both prokaryotic and eukaryotic host cells. Suitable host cells include 
bacteria such as E. coli, yeast, filamentous fungi, insect cells, mammalian cells, typically immortalized, e.g., mouse, 
CHO, human and monkey cell lines and derivatives thereof. Preferred host cells are able to process the expression 
products to produce an appropriate mature polypeptide. Processing includes but is not limited to glycosylation,
ubiquitination, disulfide bond formation and general post-translational modification. 

In an embodiment of the present invention, the ligand may be a derivative of, or a chemically modified 
ligand. The term "derivative" or "derivatised" as used herein includes the chemical modification of a ligand. 

A chemical modification of a ligand and/or a key amino acid residue of a ligand binding domain of the 
present invention may either enhance or reduce hydrogen bonding interaction, charge interaction, hydrophobic
interaction, Van Der Waals interaction or dipole interaction between the ligand and the key amino acid residue(s) of 
a galactosyltransferase ligand binding domain. 

Preferably such modifications involve the addition of substituents onto a test compound such that the 
substituents are positioned to collide or to bind preferentially with one or more amino acid residues that correspond
to the key amino acid residues of a galactosyltransferase ligand binding domain. Typical modifications may include,
for example, the replacement of a hydrogen by a halo group, an alkyl group, an acyl group or an amino group. 

The invention also relates to classes of modulators of galactosyltransferase based on the structure and shape 
of a substrate, defined in relation to the substrate molecule's spatial association with a galactosyltransferase
structure of the invention or part thereof. Therefore, a modulator may comprise a substrate molecule having the
shape or structure, preferably the structural coordinates, of a substrate molecule in an active site binding pocket of a
reaction catalyzed by a galactosyltransferase. 

Modulators Based on the 3D Structure of a Nucleotide Sugar Donor 

One class of modulators defined by the invention are compounds of the following formula I having the 
structural coordinates of uracil of Table 5, preferably Run 9, Cluster 1 or ATOM 1 to ATOM 9, inclusive of Table 7: 

wherein R1 and R2 are each independently hydrogen, alkyl, cycloalkyl, alkenyl, alkynyl, heterocyclic
rings, aryl, alkoxy, aryloxy, hydroxyl, thiol, thioaryl, amino, halogen, carboxylic acid or esters or
thioesters thereof, amines, sulfate, sulfonic or sulfinic acid or esters thereof, phosphate, pyrophosphate,
gallic acid, phosphonates, thioamide, and -OR12 where R12 is alkyl, cycloalkyl, alkenyl, alkynyl, or
heterocyclic ring;

and salts and optically active and racemic forms of a compound of the formula I. 

Another class of modulators defined by the invention are compounds of the following formula II having the 
structural coordinates of uridine of Table 5, preferably Run 9, Cluster 1 or ATOMs 1 to 20 inclusive, of Table 7: 




wherein R1, R2, R3, R4, and R5 are each independently hydrogen, alkyl, cycloalkyl, alkenyl, alkynyl,
heterocyclic rings, aryl, alkoxy, aryloxy, hydroxyl, thiol, thioaryl, amino, halogen, carboxylic acid or
esters or thioesters thereof, amines, sulfate, sulfonic or sulfinic acid or esters thereof, phosphate,
pyrophosphate, gallic acid, phosphonates, thioamide, and -OR12 where R12 is alkyl, cycloalkyl,
alkenyl, alkynyl, or heterocyclic ring,

and salts and optically active and racemic forms of a compound of the formula II. 

Yet another class of modulators defined by the invention are compounds of the following formula III 
having the structural coordinates of UDP in Table 5, preferably Run 9, Cluster 1, or ATOMs 1 to 28 inclusive of 
Table 7: 

[Chemical structure of formula III]

wherein R u R 2 , R3, &a> ^ and R„ are each independently hydrogen, alkyl, cycloalkyl, alkenyl, 
alkynyl, heterocyclic rings, aryl, alkoxy, aryloxy, hydroxyl, thiol, thioaryl, amino, halogen, carboxylic
acid or esters or thioesters thereof, amines, sulfate, sulfonic or sulfinic acid or esters thereof, phosphate,
gallic acid, phosphonates, thioamide, and -OR12 where R12 is alkyl, cycloalkyl, alkenyl, alkynyl, or
heterocyclic ring, R6 may be a monosaccharide or disaccharide, preferably a monosaccharide, including
galactose, glucose, and mannose,

and salts and optically active and racemic forms of a compound of the formula III.

Yet another class of modulators defined by the invention are compounds of the following formula IV 

having the structural coordinates of UDP-Gal in Table 6, preferably Run, Cluster 1:

wherein R1, R2, R3, R4, R7, R8, R9, and R10 are each independently hydrogen, alkyl, cycloalkyl, alkenyl,
alkynyl, heterocyclic rings, aryl, alkoxy, aryloxy, hydroxyl, thiol, thioaryl, amino, halogen, carboxylic
acid or esters or thioesters thereof (e.g. -CH2OH), amines, sulfate, sulfonic or sulfinic acid or esters
thereof, phosphate, gallic acid, phosphonates, thioamide, and -OR12 where R12 is alkyl, cycloalkyl,
alkenyl, alkynyl, or heterocyclic ring, and X is a counter-ion including sodium, lithium, potassium,
calcium, magnesium, manganese, cobalt ions and the like, as well as nontoxic ammonium, quaternary
ammonium, and amine cations, preferably Mn2+,
and salts and optically active and racemic forms of a compound of the formula IV.

One or more of R1, R2, R3, R4, R5, R6, R7, R8, R9, and/or R10, alone or together, which contain available
functional groups as described herein, may be substituted with for example one or more of the following: alkyl,
alkoxy, hydroxyl, aryl, cycloalkyl, alkenyl, alkynyl, thiol, thioalkyl, thioaryl, amino, or halo. The term "one or more"
used herein preferably refers to from 1 to 2 substituents.

The present invention contemplates all optical isomers and racemic forms thereof of the compounds of the
invention, and the formulas of the compounds shown herein are intended to encompass all possible optical isomers
of the compounds so depicted.

The present invention also contemplates salts and esters of the compounds of the invention. In particular, 
the present invention includes pharmaceutically acceptable salts. By pharmaceutically acceptable salts is meant those
salts which are suitable for use in contact with the tissues of humans and lower animals without undue toxicity,
irritation, allergic response and the like, and are commensurate with a reasonable benefit/risk ratio. Pharmaceutically
acceptable salts are well known in the art and are described, for example, in S. M. Berge, et al., J. Pharmaceutical
Sciences, 1977, 66:1-19. 
Compositions and Methods of Treatment 

The ligands and the modulators of the invention (e.g. inhibitors) may be used to modulate the biological
activity of a galactosyltransferase in a cell, including modulating a pathway in a cell regulated by the
galactosyltransferase or modulating a galactosyltransferase with inappropriate activity in a cellular organism. 

The present invention thus provides a method for treating a condition in a subject regulated by a 
galactosyltransferase or involving inappropriate galactosyltransferase activity comprising administering to a
subject an effective amount of a modulator identified using the methods of the invention. The invention still further
relates to a pharmaceutical composition which comprises a three dimensional galactosyltransferase of the invention 
or a portion thereof (e.g. a ligand binding domain), or a modulator of the invention in an amount effective to regulate 
one or more of the above-mentioned conditions and a pharmaceutically acceptable carrier, diluent or excipient. 

The invention also provides the use of a ligand or modulator according to the invention in the manufacture
of a medicament to treat and/or to prevent a disease in a patient.

Inhibitors or antagonists of α1,3-Gal transferase of the present invention may be particularly useful in
reducing xenotransplant rejection in an animal patient. Xenograft tissue may be treated with, or derived from an
animal that has been treated with, an inhibitor to decrease Galα(1,3)Gal epitopes on the xenograft tissue. This
treatment will reduce or avoid an immune reaction mediated by circulating antibodies in the transplant recipient
that are reactive with the epitopes. Preferably the xenograft tissue is of pig origin and the xenograft recipient is a
human. The xenograft tissue includes any tissue which expresses antigens having Galα(1,3)Gal epitopes. The tissue
may be in the form of an organ, for example a kidney, heart, lung, or liver, or it may be in the form of parts of organs,
cell clusters, glands and the like (e.g. lenses, pancreatic islet cells, skin, and corneal tissue).

The modulators of the invention may be converted using customary methods into pharmaceutical
compositions. The pharmaceutical compositions contain the modulators either alone or together with other active
substances. Such pharmaceutical compositions can be for oral, topical, rectal, parenteral, local, inhalant, or
intracerebral use. They are therefore in solid or semisolid form, for example pills, tablets, creams, gelatin capsules,
capsules, suppositories, soft gelatin capsules, liposomes (see for example, U.S. Patent Serial No. 5,376,452), gels,
membranes, and tubelets. For parenteral and intracerebral uses, those forms for intramuscular or subcutaneous
administration can be used, or forms for infusion or intravenous or intracerebral injection can be used, and can
therefore be prepared as solutions of the modulators or as powders of the modulators to be mixed with one or more
pharmaceutically acceptable excipients or diluents, suitable for the aforesaid uses and with an osmolarity which is
compatible with the physiological fluids. For local use, those preparations in the form of creams or ointments for
topical use or in the form of sprays should be considered; for inhalant uses, preparations in the form of sprays should
be considered.

The pharmaceutical compositions can be prepared by per se known methods for the preparation of 
pharmaceutically acceptable compositions which can be administered to patients, and such that an effective quantity 
of the active substance is combined in a mixture with a pharmaceutically acceptable vehicle. Suitable vehicles are 
described, for example, in Remington's Pharmaceutical Sciences (Remington's Pharmaceutical Sciences, Mack 
Publishing Company, Easton, Pa., USA 1985). On this basis, the pharmaceutical compositions include, albeit not
exclusively, the modulators in association with one or more pharmaceutically acceptable vehicles or diluents, and
contained in buffered solutions with a suitable pH and iso-osmotic with the physiological fluids.

The modulators may be indicated as therapeutic agents either alone or in conjunction with other therapeutic 
agents or other forms of treatment. By way of example, inhibitors may be used in combination with anti-proliferative 
agents, antimicrobial agents, immunostimulatory agents, or antiinflammatories. The modulators may be
administered concurrently, separately, or sequentially with other therapeutic agents or therapies. 

The compositions containing modulators can be administered for prophylactic and/or therapeutic
treatments. In therapeutic applications, compositions are administered to a patient already suffering from a condition
as described above, in an amount sufficient to cure or at least alleviate the symptoms of the disease and its
complications. An amount adequate to accomplish this is defined as a "therapeutically effective dose". Amounts
effective for this use will depend on the severity of the disease, the weight and general state of the patient, the nature 
of the administration route, the nature of the formulation, and the time or interval at which it is administered. 

In prophylactic applications, compositions containing modulators are administered to a patient susceptible 
to or otherwise at risk of a particular condition. Such an amount is defined to be a "prophylactically effective dose". 
In this use, the precise amounts depend on the patient's state of health and weight, the nature of the administration
route, the nature of the formulation, and the time or interval at which it is administered. 

The following non-limiting examples illustrate the invention: 
Example 1 

The modeling of bovine α-1,3-GalT was carried out using homology modeling procedures, and α-1,3-GalT-
ligand complexes were generated using automated docking procedures. These computational modeling approaches
allow fairly reasonable predictions of three-dimensional structures of proteins and their complexes with substrates and
ligands, thereby offering a rational way of investigating structure-function relationships (12). The amino acid sequence
of α-1,3-GalT was obtained from a publicly available sequence data bank (13).

Homology modeling. - The basic steps in the construction of a protein model based on a homologous structure are,
in order: amino acid sequence alignment, copying aligned coordinates, building loops, and
refinement. The sequence alignment and secondary structure predictions were carried out using the Fold recognition
server located at UCLA (14). The Molecular Simulations Inc. collection of programs was used for all protein
modeling (15-17). The template structure chosen was the three-dimensional crystal structure (9) of SpsA determined
at a resolution of 1.5 Å. The initial alignment of α-1,3-GalT and SpsA transferase sequences was obtained using the
pair-wise alignment with the HOMOLOGY program (15). Multiple alignment of amino acid sequences was
performed using the Needleman and Wunsch method (18). This method provides an optimum alignment of
two sequences that represents the best overall balance between the number of good amino acid matches and the least
number of required gaps. When necessary, the initial pair-wise sequence alignments were manually modified to
obtain structure-oriented alignments. After creating the alignment, the coordinates of the homologous regions were
transferred from the SpsA structure to the bovine α-1,3-GalT using the MODELER program (16). The geometry of
the generated model was then locally optimized to remove steric side-chain clashes. The builder module of the
InsightII program (17) was used to add hydrogen atoms to the enzyme and assign partial charges.
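
The trade-off between matches and gaps that the Needleman and Wunsch method optimizes can be illustrated with a minimal sketch. The scoring values (match, mismatch, gap) and the function name below are illustrative assumptions, not the parameters used by the HOMOLOGY program; real protein alignments would normally use a substitution matrix rather than a flat match/mismatch score.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment of sequences a and b.

    Returns (score, aligned_a, aligned_b). The scoring values are
    illustrative assumptions, not those of any particular program.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + s,   # pair the residues
                              score[i - 1][j] + gap,     # gap in b
                              score[i][j - 1] + gap)     # gap in a
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                out_a.append(a[i - 1])
                out_b.append(b[j - 1])
                i -= 1
                j -= 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))
```

With these assumed scores, aligning the short strings "ACGT" and "AGT" places a single gap opposite the unmatched residue, since one gap costs less than shifting every remaining residue out of register.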
Docking. - Structures of α-1,3-GalT complexes with UDP, UDP-Gal, and a recently designed inhibitor (19) were
determined using the AutoDock suite of programs (20), which finds favorable docked configurations for a ligand in a
protein-binding site starting from an arbitrary conformation, orientation and position of the ligand molecule.
AutoDock combines conformational search methods, such as genetic and stochastic algorithms, with a grid-based
energy calculation using a molecular mechanics type force field that includes electrostatic, hydrogen bonding,
dispersion/repulsion, solvation and entropic terms. The overall interaction between the enzyme and the ligands was
computed using the Amber-like force field as implemented in AutoDock (20). A Mn2+ cation position was located,
based on the SpsA structure, near the side chain of Asp-227, which belongs to the aspartate-valine-aspartate
(DVD) sequence motif. An aspartate-any residue-aspartate (DXD) or aspartate-any residue-histidine (DXH) motif
is common to many glycosyltransferases (21) and is involved in binding metal cations as well as the substrate. Water
molecules were not considered in these computations. Positions of all protein atoms were fixed during the docking.
The dihedral angles of all ligands were optimized while bond lengths and bond angles were restrained to standard
values. The starting structure of UDP was obtained from the SpsA-UDP complex, and UDP-Gal was generated using
InsightII (17). The conformations of the ribose, galactose and uracil rings were fixed during the docking. In the present
work a genetic algorithm was used as the search method. One hundred docking runs were performed for generating
complexes of α-1,3-GalT with each of the chosen ligands. For each docking simulation, the population size was set to
50 and 27,000 generations were run. The docked models were clustered using a root mean square tolerance value of
1.5 Å. This approach has been successfully used for a wide variety of structural problems and has been fully described
elsewhere (20).
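
The clustering step can be sketched as follows. This is a minimal illustration that assumes a simple seed-based scheme (poses are visited in order of increasing energy and join the first cluster whose lowest-energy member lies within the tolerance); it is not claimed to reproduce AutoDock's implementation. Because the protein atoms were held fixed, the root mean square deviation is computed here directly on ligand coordinates without superposition. The function names and the data layout are hypothetical.

```python
import math


def rmsd(pose_a, pose_b):
    """Root mean square deviation between two poses.

    Each pose is a list of (x, y, z) tuples in the same atom order.
    No superposition is applied (assumes a fixed receptor frame).
    """
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and of equal length")
    total = 0.0
    for atom_a, atom_b in zip(pose_a, pose_b):
        total += sum((ca - cb) ** 2 for ca, cb in zip(atom_a, atom_b))
    return math.sqrt(total / len(pose_a))


def cluster_poses(poses, energies, tolerance=1.5):
    """Group docked poses into clusters by RMSD to each cluster's seed.

    Returns a list of clusters; each cluster is a list of pose indices
    whose first element is the lowest-energy member (the seed).
    """
    order = sorted(range(len(poses)), key=lambda k: energies[k])
    clusters = []
    for k in order:
        for members in clusters:
            if rmsd(poses[k], poses[members[0]]) <= tolerance:
                members.append(k)
                break
        else:
            # No existing seed is close enough, so this pose starts a new cluster.
            clusters.append([k])
    return clusters
```

The size of each cluster relative to the number of runs then gives the fraction of docking runs that converged on a given binding mode.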
Results and Discussions 

Homology model of α-1,3-GalT. - The amino acid sequence alignment of α-1,3-GalT with SpsA and homologous
proteins is shown in Figure 1. The highest scoring alignment shows about 40% similarity and 20% identity (45
amino acids are identical). The amino acid residues of SpsA that interact with UDP or are located within the UDP
binding site are underlined. A clear sequence similarity can be noticed between the active site regions of SpsA and the
corresponding aligned residues of α-1,3-GalT. In this figure it can be seen that the residues are well conserved in the
region that encompasses the putative UDP binding pocket of SpsA. Table 3 shows the predicted secondary structures
for the α-1,3-GalT sequence that was used for generating a homology model of α-1,3-GalT.
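
How identity and similarity percentages of this kind are obtained from a finished alignment can be sketched as below. Two points are assumptions made only for illustration: the percentages are taken over columns in which neither sequence has a gap, and "similar" residues are those falling in the same group of a simple hand-written grouping (SIMILAR_GROUPS). The source does not state which denominator or similarity definition was used.

```python
# Illustrative residue grouping (an assumption, not the scheme used in the source).
SIMILAR_GROUPS = ["AVLIMC", "FWY", "ST", "NQ", "DE", "KRH", "G", "P"]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Gaps are written as '-'. Columns containing a gap are skipped, so the
    percentages are relative to the number of residue-residue columns.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    group_of = {res: idx for idx, grp in enumerate(SIMILAR_GROUPS) for res in grp}
    compared = identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif x in group_of and group_of[x] == group_of.get(y):
            similar += 1
    if compared == 0:
        return 0.0, 0.0
    return 100.0 * identical / compared, 100.0 * similar / compared
```

Under this convention, identity is always less than or equal to similarity, which matches the relationship between the two figures quoted above.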

The homology model of α-1,3-GalT consists of two compact domains. The predicted N-terminal domain has
about 100 residues, starting at Gln-125 and ending at Gln-231, and the C-terminal domain contains the remaining
modeled residues. Figure 2 shows a superposition of the α-1,3-GalT model (blue) and the corresponding SpsA structure
(magenta). The amino acid residues of SpsA that interact with the UDP ligand are shown as tubes. The corresponding
amino acid residues of α-1,3-GalT are shown as thin tubes. In addition to this overlap at the active site, several exo-
site residues are homologous and placed in similar positions in three-dimensional space. It can be seen from
Figure 2 that the modeled α-1,3-GalT is a compact structure similar to that of SpsA. The overall size of the model of
α-1,3-GalT is about 50 Å x 45 Å x 40 Å. The (φ,ψ) angles of the constructed model are well within the allowed
region of the Ramachandran maps (22). The UDP binding site is identified at the cleft between the strands of
conserved residues and an alpha helix within this domain. This site is very deep and is highly electronegative in
nature. The active site consists of an open α/β-sandwich made up of three helices packed against four standard
β-sheets. The general topology of the modeled α-1,3-GalT resembles those of GnT I and SpsA, with the secondary
structural elements similarly arranged in space. The following amino acid residues have been identified to be part of
the UDP docking pocket of α-1,3-GalT: Phe-134, Tyr-139, Ile-140, Val-136, Arg-194, Arg-202, Lys-209, Asp-173,
His-218, Thr-137, Asp-225, Val-226, and Asp-227. The modeled catalytic domain has a core structure common to
most of the known transferases (9-11). Moreover, amino acid residues that are involved in the UDP-Gal recognition
and in the catalytic mechanism are homologous both in sequence and spatial relationship. As a consequence, the
overall electrostatic property of the active site of the α-1,3-GalT is highly comparable with the UDP binding sites of
GnT I and SpsA. Thus, the present analysis suggests that although the sequence homologies of SpsA, GnT I and
α-1,3-GalT are relatively low, they have a structurally conserved framework of about 100 residues that specifically
recognize UDP.
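
The backbone angles plotted on a Ramachandran map are ordinary dihedral angles: φ is defined by the atoms C(i-1), N(i), CA(i), C(i), and ψ by N(i), CA(i), C(i), N(i+1). A minimal sketch of the dihedral calculation from four Cartesian points is given below; the function name and the tuple-based coordinate format are illustrative assumptions, and no particular modeling package is implied.

```python
import math


def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])


def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees for four points (x, y, z).

    For phi pass C(i-1), N, CA, C; for psi pass N, CA, C, N(i+1).
    """
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central bond.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))
```

Computing φ and ψ for every residue of a model and comparing the pairs against the allowed regions of the map is the kind of check summarized in the paragraph above.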

Complex of α-1,3-GalT with UDP and UDP-Gal. - In the GnT I, SpsA, and β4Gal-T1 structures (9-11), the above-
described architecture of the secondary structure elements specifically recognizes UDP. In these X-ray structures, a
conserved aspartate (Asp-39 in SpsA and Asp-144 in GnT I) generally interacts through a hydrogen bond
with the carbonyl at the 4th position of the uracil ring. The carbonyl at the 2nd position of the uracil favors charge
interactions with the conserved His residue that resides at the bottom of the UDP pocket. The ribose ring packs with
the conserved hydrophobic residue (Thr-9 in SpsA and Ile-113 in GnT I) that is located at the bottom of the pocket.
In the model of α-1,3-GalT, the metal binding site is located at one of the β-strands that contains the conserved DVD
(Asp-225, Val-226 and Asp-227) motif. These conserved residues are assumed to be located in the vicinity of the
pyrophosphate-binding region. The C-terminal portion of the model has a confined groove, which has a stretch of
charged residues. The docking studies described below suggest that this region can specifically recognize inhibitors
that are designed based on the acceptor substrate model (19).

Simulation of the α-1,3-GalT-UDP complexes using an automated docking procedure led to several
complex structures that represent different binding modes of UDP, which were clustered into nine groups. Analysis of
the results revealed that in about 80% of the docking calculations, the UDP binds at the well-defined pocket located at
the DVD motif. The low energy docking modes of UDP to the α-1,3-GalT are shown in Figure 3. The α-1,3-GalT
structure is presented in ribbon form and the amino acid residues that directly interact with UDP are labeled. The five
top ranking clusters are characterized in Table 2 together with the computed binding energy and the estimated inhibition
constant. Possible intermolecular contacts in the lowest energy complex are listed in Table 1. In the top three clusters,
UDP binds in the deep pocket generally in a similar conformation. This is illustrated in Figure 3, where the preferred
binding mode is shown as a thick blue tube. Three hydrogen bonds that are possible between the uracil and α-1,3-
GalT characterize this binding mode. These are (1) the amide hydrogen of uracil in position 3 and OD1 of Asp-168,
(2) the carbonyl oxygen of uracil in position 4 and the side chain of Lys-204, and (3) the carbonyl oxygen of uracil in
position 2 and the amide hydrogen of the His-213 side chain. The hydroxyl groups at the 2 and 3 positions of the
ribose ring form three hydrogen bonds with the Asp-225 side chain oxygens. The pyrophosphate oxygens interact
with the Asp-227 side chain through the metal ion. Apart from these hydrogen bond interactions, many favorable
hydrophobic interactions are possible between the uridine and the protein. It is clear from Table 1 that the bound UDP
generally favors interactions with conserved amino acid residues of the enzyme. However, some of the residues that
do not interact directly with UDP but lie in the close vicinity of the UDP docked region are Tyr-139, Ile-140, Val-
136, Arg-194, Asp-197, Ile-198, Arg-202, Lys-204, His-209 and His-213. It is noteworthy that some of these residues,
such as Tyr-139 and Asp-197, are conserved across various species (8). It is possible that these active site side chains may
be involved in direct binding interactions with UDP.
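
The link between a computed binding energy and an estimated inhibition constant follows from the standard thermodynamic relation ΔG = RT ln(Ki). The sketch below applies that relation; the temperature of 298.15 K and the example energy of -8.0 kcal/mol are assumptions chosen only for illustration and are not values from Table 2.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def inhibition_constant(delta_g_kcal, temperature_k=298.15):
    """Estimate Ki (in mol/L) from a binding free energy in kcal/mol.

    Uses delta_G = R * T * ln(Ki), so a more negative energy gives a smaller Ki.
    """
    return math.exp(delta_g_kcal / (GAS_CONSTANT_KCAL * temperature_k))


def binding_energy(ki_molar, temperature_k=298.15):
    """Inverse relation: binding free energy in kcal/mol from Ki in mol/L."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(ki_molar)


if __name__ == "__main__":
    # Hypothetical energy, for illustration only.
    ki = inhibition_constant(-8.0)
    print(f"Estimated Ki: {ki:.2e} M")
```

At room temperature each additional 1.36 kcal/mol of favorable binding energy lowers the estimated Ki by roughly a factor of ten, which is why small differences in docking energy between clusters translate into large differences in predicted affinity.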

The lowest energy cluster consists of about 30% of all the docking runs. The analysis of the other low energy 
clusters that represent about 70% of docked structures clearly shows that many of the docking modes were very close
to the lowest energy-binding mode. However, small variations in the nature of local interactions between the 
pyrophosphate part and the enzyme were observed. It can be seen from Figure 3 that the 5 and 6 positions of the 
uracil ring are exposed to the solvent and the remaining positions of the uracil fragment are in contact with the 
protein. 

The structure of the UDP-Gal complex with α-1,3-GalT has been generated using the approach described
above. Figure 4 shows the low energy binding modes of this complex. The comparison of the α-1,3-GalT complexes
with UDP and UDP-Gal reveals that the uridine portion of the UDP-Gal assumes a similar binding orientation as in
the case of the α-1,3-GalT-UDP complex. These results suggest that the addition of the galactopyranose residue to
UDP does not alter the binding mode of the uridine, which is tightly bound in the active site. By contrast, the
pyrophosphate is more flexible and its conformation alters upon addition of this monosaccharide unit to the UDP.
These data indicate that the design of an inhibitor based on the docking sites of pyrophosphate and donor sugar group
fragments of UDP-Gal should consider the possible conformational flexibility of the pyrophosphate group and the
corresponding diversity associated with binding interactions.

In the crystal structure of the complex of SpsA with UDP, the UDP is bound at the active site of the enzyme
(8). The uracil ring of the bound UDP is placed into the cavity where its carbonyl and amide hydrogens form two
hydrogen bonds with side-chains of Arg-71 and Asp-39, respectively. Apart from these hydrogen bond interactions, a
favorable stacking interaction between the uracil ring and side chain of Tyr-11 is possible. A strong hydrogen bond
interaction is possible between the hydroxyl of ribose in position 3 and the side chain oxygen of Asp-99. The
pyrophosphate conformation is confined to a particular orientation due to the favorable charge interactions with the
bound metal ion. Unligil et al (10) have solved a structure of GnT I complexed with UDP-GlcNAc at 1.5 Å resolution.
In this crystal structure of the GnT I complex, the uracil ring favors a similar interaction, as observed in the SpsA
complex, with the nucleotide binding domain residues consisting of a Lys and an Asp. The ribose portion of the UDP
binds into the hydrophobic-rich region of GnT I and thereby gains stacking energy. Thus, these two structures
possess a clear structural and sequence similarity at the UDP binding pocket. However, overall there is no sequence
homology between the two proteins. The bound UDP conformation is very similar in these structural complexes.
These data suggest that amino acid conservation at the UDP binding pocket is important for the precise recognition of
UDP ligands. The homology model of α-1,3-GalT contains these critical amino acids at the identified pocket of the
enzyme (Figures 2 and 3). The top ranking docked complexes are in agreement with reported X-ray structures of
glycosyltransferases (7, 9, and 11). This suggests that a part of the substrate binding pocket in glycosyltransferases is
specifically tailored to bind UDP. It is evident from the computed docking models that the binding modes of UDP
generally favor a standard type of interaction with the enzyme. In the predicted low energy complexes of UDP and
UDP-Gal with α-1,3-GalT, the DVD motif of the enzyme interacts with pyrophosphate through the modeled metal
cation.

Binding mode of an inhibitor to α-1,3-GalT
Recently, an inhibitor based on the acceptor of α-1,3-GalT has been designed (19). This compound has a
disaccharide linked to a bromine-substituted naphthamide ring. It has been shown that, once the terminal
sugar unit is removed, the compound no longer inhibits α-1,3-GalT but instead inhibits β-1,4-GalT. Thus, determining the
binding mode of this inhibitor to α-1,3-GalT might provide a stereochemical explanation for the observed binding
affinities. Using the docking procedure described above, this synthetic inhibitor was docked to the surface of α-1,3-GalT.
Docking simulations produced two distinct favorable regions for this molecule, both located in the active site of the
enzyme. In the first, the inhibitor occupies the UDP binding site. Generally, in this low energy binding mode the
inhibitor is placed well in the uridine pocket. The second largest cluster of conformations is located at the acceptor
site. Figure 5 shows the computed binding mode of the inhibitor at the acceptor-binding region of the protein. In this
binding mode, the terminal saccharide binds close to the Asp-227 side chain and the bulky aromatic group of the
inhibitor interacts with the side chain of Ile-283. The bromine atom is located close to the side chain of Asp-227
and the naphthamide ring is placed on top of the Met-224 side chain. It can be seen that the inhibitor not only
occupies the acceptor-binding region of the protein but also has considerable interactions at the donor site of the
enzyme. Thus, these predicted binding modes of the inhibitor could explain its inhibitory activity.
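
The grouping of docked conformations into clusters, as referred to above, can be illustrated with a minimal Python sketch. It assumes a simple greedy scheme in which conformations are ordered from lowest to highest energy and each one joins the first cluster whose lowest-energy member lies within an RMSD tolerance. The function names `rmsd` and `cluster_poses`, the tolerance value, and the toy coordinates are hypothetical; the specification does not state the clustering procedure that was actually used.

```python
import math


def rmsd(pose_a, pose_b):
    """Root-mean-square deviation between two poses.

    Each pose is a list of (x, y, z) tuples with atoms in the same order.
    """
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and have the same atom count")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(pose_a, pose_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(pose_a))


def cluster_poses(poses, tolerance):
    """Greedy clustering (an assumed scheme, not taken from the specification).

    `poses` is assumed to be sorted from lowest to highest energy. Each pose
    joins the first cluster whose seed (first member) is within `tolerance`;
    otherwise it starts a new cluster. Returns lists of pose indices.
    """
    clusters = []
    for index, pose in enumerate(poses):
        for cluster in clusters:
            if rmsd(poses[cluster[0]], pose) <= tolerance:
                cluster.append(index)
                break
        else:
            clusters.append([index])
    return clusters


# Toy two-atom poses (hypothetical coordinates, for illustration only).
example_poses = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.1, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(5.0, 5.0, 5.0), (6.0, 5.0, 5.0)],
]
example_clusters = cluster_poses(example_poses, tolerance=1.0)
```

Ranking the resulting clusters by size would then give the largest and second largest groups of conformations in the sense used in the paragraph above.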

Figures 6 to 9 also show models of α-1,3-GalT and ligand binding domains of the enzyme.

Conclusions

Using a combination of homology modeling and molecular docking approaches, the α-1,3-GalT structure
and its complexes with UDP, UDP-Gal, and a synthetic inhibitor have been modeled. The predicted N-terminal
domain of α-1,3-GalT has about 100 residues that start at Gln-125 and end at Gln-131. The overall
secondary structure arrangements, amino acid properties, spatial arrangement of critical amino acid residues and size
of this domain are highly comparable with other GnT structures. The predicted pocket on this domain surface of
α-1,3-GalT specifically recognizes UDP in a unique binding mode. Structural analysis and comparative studies of the
modeled binding site with the GnT I and SpsA structures suggested a high degree of similarity at the UDP binding
pocket. This implies a possible structural homology in glycosyltransferases in spite of their low sequence identity and
homology. Thus the modeled bovine structure of α-1,3-GalT provides a framework to better understand the functional
and structural similarities between galactosyltransferases.

While the present invention has been described with reference to what are presently considered to be the
preferred examples, it is to be understood that the invention is not limited to the disclosed examples. To the contrary,
the invention is intended to cover various modifications and equivalent arrangements included within the spirit and
scope of the appended claims.

All publications, patents and patent applications are herein incorporated by reference in their entirety to the
same extent as if each individual publication, patent or patent application was specifically and individually indicated
to be incorporated by reference in its entirety.
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Table 1 

Atomic Interactions between GalT and UDP

Atomic        Atomic Contact       Atomic Contact       Distance Between Atomic     Nature of
Interaction   on UDP               on GalT              Contacts on GalT and UDP    Interaction

1             Uracil NH            Asp-168 OD1          2.1 ± 0.5                   HB
2             Uracil O4            Lys-204 HZ1          3.0 ± 0.5                   HB
3             Uracil O2            His-213 NE2          2.7 ± 0.5                   HB
4             Uracil ring          Phe-134 ring         4.2 ± 0.5                   HP
5             Ribose OH2           Asp-225 OD2          2.2 ± 0.5                   HB
6             Ribose OH3           Asp-225 OD2          2.5 ± 0.5                   HB
7             Ribose ring          Leu-131              4.1 ± 0.5                   HP
8             Ribose ring          Ile-210              4.0 ± 0.5                   HP
9             O1A (diphosphate)    Asp-225 OD2 (Mn)     4.6 ± 0.5                   MM
10            O1A (diphosphate)    Asp-227 OD2 (Mn)     4.5 ± 0.5                   MM
11            O2B (diphosphate)    Asp-227 OD2 (Mn)     5.1 ± 0.5                   MM

HB: hydrogen bond interaction
MM: metal mediated interaction
HP: hydrophobic interaction
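
The distance criteria of Table 1 lend themselves to a simple computational check. The following Python sketch is illustrative only: it assumes that each contact distance is a Euclidean distance in the same units as the structural coordinates, and that a contact is regarded as satisfied when the measured distance lies within the tabulated ± 0.5 tolerance. The function names and the example coordinates are hypothetical and are not taken from the specification.

```python
import math


def contact_satisfied(ligand_atom, enzyme_atom, expected, tolerance=0.5):
    """Return True when the measured distance is within expected +/- tolerance.

    ligand_atom and enzyme_atom are (x, y, z) tuples; expected is the
    tabulated distance for the atomic interaction (for example 2.1 for
    interaction 1 of Table 1).
    """
    distance = math.dist(ligand_atom, enzyme_atom)
    return abs(distance - expected) <= tolerance


def count_satisfied(contacts):
    """Count satisfied contacts in a list of (ligand_atom, enzyme_atom, expected)."""
    return sum(
        1
        for ligand_atom, enzyme_atom, expected in contacts
        if contact_satisfied(ligand_atom, enzyme_atom, expected)
    )


# Hypothetical coordinates, chosen only to exercise the check.
example_contacts = [
    ((0.0, 0.0, 0.0), (2.3, 0.0, 0.0), 2.1),  # 2.3 is within 2.1 +/- 0.5
    ((0.0, 0.0, 0.0), (3.0, 0.0, 0.0), 2.1),  # 3.0 is outside 2.1 +/- 0.5
]
example_count = count_satisfied(example_contacts)
```

A test compound could be scored in this way by the number of Table 1 interactions it reproduces, with the threshold for a "selected number" of contacts left to the user.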
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Table 2 

Characterization of the Top Five Binding Modes of UDP to the α-1,3-GalT
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041 
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.020 
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.499 


25. 


035 
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49, 
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0 


.725 
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48 


.514 
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2224 


CA 


THR 


263 
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.479 


19. 


.134 


48. 


.583 
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HA 


THR 


263 


-0 


.713 


18 


. 443 


49, 


.392 


45 
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2226 


CB 


THR 


263 


-1 


.753 


19 


.421 


47 


. 846 




ATOM   2227  HB   THR   263      -1.533  20.090  47.014
ATOM   2228  OG1  THR   263      -2.686  20.055  48.709
ATOM   2229  HG1  THR   263      -2.364  21.009  48.923
ATOM   2230  CG2  THR   263      -2.331  18.096  47.324
ATOM   2231  HG2  THR   263      -3.259  18.290  46.786
ATOM   2232  HG2  THR   263      -1.614  17.625  46.652
ATOM   2233  HG2  THR   263      -2.531  17.430  48.164
ATOM   2234  C    THR   263       0.435  18.451  47.613
ATOM   2235  O    THR   263       1.217  19.096  46.917
ATOM   2236  N    TYR   264       0.361  17.101  47.562
ATOM   2237  HN   TYR   264      -0.251  16.601  48.221
ATOM   2238  CA   TYR   264       1.128  16.359  46.603




ATOM 


2239 


HA 
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264 
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.472 
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.855 
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CB 
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HA 


GLU 


265 
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208 
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CB 
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429 
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120 
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871 
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-1 . 
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42 . 


070 
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42. 


841 
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872 


41 . 
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CD 


GLU 
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966 
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028 
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434 
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486 
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CA 
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43. 


523 
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340 
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CB 
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43 . 


540 
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, 451 
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43. 


275 
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266 
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. 202 


42 . 


848 


^0 
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CG 


ARG 


266 
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. 517 
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44 . 


901 
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-2 . 
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. 648 


45 . 
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♦ATOM 
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HG2 
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-0 , 


. 805 


8 . 


,762 


45. 


723 
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on 
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-2 . 


. 361 


7 , 
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44 . 


981 




ATOM 


228 4 


HD1 
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-1 , 


.734 


6 , 


.737 


44 . 


, 651 




ATOM 


2285 
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266 


-3 . 


. 220 


7 . 


.702 


44 . 


, 324 




ATOM 
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NE 


ARG 


266 


-2 , 


.783 


7 , 
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46. 


, 400 
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HE 
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266 


-2 


. 869 
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. 221 


47 . 


, 013 




ATOM 


2288 
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266 


-3 , 


. 055 


6 . 
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46. 


, 886 




ATOM 


2289 
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266 


-2 


. 958 


5 


. 065 


46. 


, 067 


AO 
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2 66 
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' 4 
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46, 


. 433 
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266 


-2 


. 67 9 
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.083 
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266 


-3 
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. 992 


48 . 


. 193 
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HH2 


ARG 


266 


-3 
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5 


.052 


48, 


.560 
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266 


-3 


. 478 
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.812 


48 . 


. 813 
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. 881 


10 


.104 


42 , 


. 208 




ATOM   2296  O    ARG   266       0.463  10.727  41.235
ATOM   2297  N    ARG   267       1.979   9.329  42.154
ATOM   2298  HN   ARG   267       2.288   8.803  42.984
ATOM   2299  CA   ARG   267       2.710   9.246  40.932
ATOM   2300  HA   ARG   267       2.245   9.909  40.202




ATOM 


2301 


CB 


ARG 


. 267 
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. 190 


9 


. 606 


41 


. 128 




ATOM   2302  HB1  ARG   267       4.777   9.522  40.213
ATOM   2303  HB2  ARG   267       4.693   8.969  41.857
ATOM   2304  CG   ARG   267       4.400  11.039  41.622


55 


ATOM 


2305 


HGl 


ARG 


267 


3 


. 662 


11 
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42 


.356 




ATOM • 


2306 


HG2 
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267 


4 


. 354 


11 


.787 


40 


.830 




ATOM 


2307 


CD 
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267 


5 


.754 


11 


.265 


42 


.299 




ATOM 
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HD1 
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267 


5 


. 861 


10 
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43 


.100 




ATOM 


2309 


HD2 
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267 


5 


.772 


12 
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42 


. 697 


60 


ATOM 


2310 
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Residue number will be set to the conformation's cluster rank. 
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Run = 83 

Cluster Rank = 1
Number of conformations in this cluster = 30
RMSD from reference structure       = 2.261 A
Estimated Free Energy of Binding    = -8.49 kcal/mol  [=(1)+(3)]
Estimated Inhibition Constant, Ki   = +5.99e-07       [Temperature
Final Docked Energy
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(1) Final Intermolecular Energy = 

(2) Final Internal Energy of Ligand = 
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65 

Run = 65
Cluster Rank = 1
Number of conformations in this cluster = 30
RMSD from reference structure       = 2.304 A
Estimated Free Energy of Binding    = -8.71 kcal/mol  [=(1)+(3)]
Estimated Inhibition Constant, Ki   = +4.12e-07       [Temperature = 298.15 K]
Final Docked Energy
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(1) Final Intermolecular Energy = 

(2) Final Internal Energy of Ligand = 

(3) Torsional Free Energy 
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NEWDPF ndihe7 

NEWDPF dihe0 174.86 35.30 170.27 1.85 94.80 -103.65 115.10
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14 

Run = 14 

Cluster Rank = 1

Number of conformations in this cluster = 30 



RMSD from reference structure 

Estimated Free Energy of Binding 
Estimated Inhibition Constant, Ki 

Final Docked Energy 

(1) Final Intermolecular Energy 

(2) Final Internal Energy of Ligand 

(3) Torsional Free Energy 
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Run - 99 
Cluster Rank = 1
Number of conformations in this cluster = 30
RMSD from reference structure       = 2.336 A
Estimated Free Energy of Binding    = -8.47 kcal/mol  [=(1)+(3)]
Estimated Inhibition Constant, Ki   = +6.23e-07       [Temperature = 298.15 K]
Final Docked Energy                 = -11.36 kcal/mol [=(1)+(2)]
(1) Final Intermolecular Energy     = -10.65 kcal/mol
(2) Final Internal Energy of Ligand = -0.71 kcal/mol
(3) Torsional Free Energy           = +2.18 kcal/mol

DPF = test.dpf
NEWDPF move udp_tr.pdbq
NEWDPF about 16.792999 18.735001 34.970001
NEWDPF tran0 16.837146 19.319611 35.006964
NEWDPF quat0 -0.287528 -0.036292 0.957084 6.817381
NEWDPF ndihe 7
NEWDPF dihe0 179.27 74.01 -73.43 -63.66 -99.15 70.88 172.83
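
The bracketed annotations in the docking summaries indicate how the reported energies are combined: the estimated free energy of binding is the sum of terms (1) and (3), and the final docked energy is the sum of terms (1) and (2). The Python sketch below reproduces these sums for Run 99. The conversion from free energy to inhibition constant, Ki = exp(ΔG / RT), is a standard thermodynamic relation assumed here for illustration; it is not stated in the specification, and the function names and the rounded gas constant are likewise illustrative choices.

```python
import math

GAS_CONSTANT = 1.987e-3  # kcal/(mol*K), rounded value (assumption)


def binding_free_energy(intermolecular, torsional):
    """Estimated free energy of binding, labelled [=(1)+(3)] in the summaries."""
    return intermolecular + torsional


def docked_energy(intermolecular, internal):
    """Final docked energy, labelled [=(1)+(2)] in the summaries."""
    return intermolecular + internal


def inhibition_constant(free_energy, temperature=298.15):
    """Ki from the assumed relation Ki = exp(dG / (R * T)); dG in kcal/mol."""
    return math.exp(free_energy / (GAS_CONSTANT * temperature))


# Values (1), (2) and (3) reported for Run 99, in kcal/mol.
run99_free_energy = binding_free_energy(-10.65, 2.18)
run99_docked = docked_energy(-10.65, -0.71)
run99_ki = inhibition_constant(run99_free_energy)
```

Under these assumptions the two sums agree with the reported -8.47 and -11.36 kcal/mol, and the computed Ki is close to the reported value; the small difference is consistent with the energies being printed to two decimal places.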
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89 

Run = 89
Cluster Rank = 1
Number of conformations in this cluster = 30
RMSD from reference structure       = 2.343 A
Estimated Free Energy of Binding    = -8.33 kcal/mol  [=(1)+(3)]
Estimated Inhibition Constant, Ki   = +7.88e-07       [Temperature = 298.15 K]
Final Docked Energy                 = -11.35 kcal/mol [=(1)+(2)]
(1) Final Intermolecular Energy     = -10.51 kcal/mol
(2) Final Internal Energy of Ligand = -0.85 kcal/mol
(3) Torsional Free Energy           = +2.18 kcal/mol

DPF = test.dpf
NEWDPF move udp_tr.pdbq
NEWDPF about 16.792999 18.735001 34.970001
NEWDPF tran0 17.054940 19.477433 34.899250
NEWDPF quat0 0.673805 0.287903 -0.680513 -10.385254
NEWDPF ndihe 7
NEWDPF dihe0 -157.20 94.24 8.30 -47.60 -85.48 50.85 179.
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Run =75 
Cluster Rank = 1
Number of conformations in this cluster = 30
RMSD from reference structure       = 2.190 A
Estimated Free Energy of Binding    = -8.35 kcal/mol  [=(1)+(3)]
Estimated Inhibition Constant, Ki   = +7.52e-07       [Temperature = 298.15 K]
Final Docked Energy                 = -11.34 kcal/mol [=(1)+(2)]
(1) Final Intermolecular Energy     = -10.53 kcal/mol
(2) Final Internal Energy of Ligand = -0.81 kcal/mol
(3) Torsional Free Energy           = +2.18 kcal/mol

DPF = test.dpf
NEWDPF move udp_tr.pdbq
NEWDPF about 16.792999 18.735001 34.970001
NEWDPF tran0 16.649808 19.351573 34.884284
NEWDPF quat0 0.238273 0.242155 -0.940525 -7.710898
NEWDPF ndihe 7
NEWDPF dihe0 162.51 45.31 -179.82 136.56 -34.17 0.93 124.87



USER                   Rank      x        y        z      vdW    Elec      q       RMS
ATOM      1  N1   UDP     1    18.047   20.259   33.278   -0.38   -0.10   -0.211   2.190
ATOM      2  C2   UDP     1    18.316   21.566   32.981   -0.84   +0.28   +0.396   2.190
ATOM      3  N3   UDP     1    19.631   21.879   32.746   -0.54   -0.40   -0.440   2.190
ATOM      4  H3   UDP     1    19.844   22.864   32.537   +0.04   +0.71   +0.440   2.190
ATOM      5  C4   UDP     1    20.707   20.990   32.764   -0.74   +0.26   +0.396   2.190
ATOM      6  C5   UDP     1    20.358   19.639   33.074   -0.54   +0.00   +0.000   2.190
ATOM      7  C6   UDP     1    19.074   19.323   33.299   -0.48   +0.00   +0.000   2.190
ATOM      8  O2   UDP     1    17.429   22.420   32.939   -0.30   -0.31   -0.396   2.190
ATOM      9  O4   UDP     1    21.832   21.436   32.538   -0.16   -0.18   -0.396   2.190
ATOM     10  C1'  UDP     1    16.661   19.859   33.555   -0.64   +0.06   +0.324   2.190
ATOM     11  C2'  UDP     1    16.169   18.663   32.736   -0.65   -0.01   +0.113   2.190
ATOM     12  C3'  UDP     1    15.019   18.160   33.638   -0.68   -0.01   +0.113   2.190
ATOM     13  C4'  UDP     1    15.502   18.477   35.053   -0.56   +0.03   +0.113   2.190
ATOM     14  O4'  UDP     1    16.650   19.352   34.884   -0.04   -0.07   -0.227   2.190
ATOM     15  O2'  UDP     1    15.656   19.090   31.496   -0.23   +0.22   -0.537   2.190
ATOM     16  HO2' UDP     1    15.558   18.281   30.868   -0.28   -0.45   +0.424   2.190
ATOM     17  O3'  UDP     1    13.870   18.955   33.333   -0.22   +0.10   -0.537   2.190
ATOM     18  HO3' UDP     1    14.118   19.662   32.626   -0.35   -0.35   +0.424   2.190
ATOM     19  C5'  UDP     1    15.946   17.234   35.825   -0.35   +0.04   +0.113   2.190
ATOM     20  O5'  UDP     1    16.337   17.645   37.122   +0.01   -0.15   -0.368   2.190
ATOM     21  PA   UDP     1    17.525   17.236   38.076   -0.63   +0.28   +1.019   2.190
ATOM     22  O1A  UDP     1    18.643   16.742   37.223   +0.08   -0.09   -0.255   2.190
ATOM     23  O2A  UDP     1    17.796   18.371   39.013   -0.21   -0.03   -0.255   2.190
ATOM     24  O3A  UDP     1    16.769   16.000   38.744   -0.03   -0.18   -0.510   2.190
ATOM     25  PB   UDP     1    15.718   15.879   39.943   -0.73   +0.44   +1.019   2.190
ATOM     26  O1B  UDP     1    14.699   17.113   39.721   -0.15   -0.12   -0.255   2.190
ATOM     27  O2B  UDP     1    16.601   16.105   41.115   -0.56   -0.24   -0.255   2.190
ATOM     28  O3B  UDP     1    14.849   14.676   39.880   +0.00   -0.08   -0.255   2.190

TER 

ENDMDL 

MODEL 

USER 

USER 



34 

Run = 34

Cluster Rank = 1 






WO 01/83717 



PCT/CA01/00607 









Number of conformations in this cluster
RMSD from reference structure       = 2.097 A
Estimated Free Energy of Binding    = -8.20 kcal/mol  [=(1)+(3)]
Estimated Inhibition Constant, Ki   = +9.82e-07       [Temperature = 298.15 K]
Final Docked Energy                 = -11.33 kcal/mol [=(1)+(2)]
(1) Final Intermolecular Energy     = -10.38 kcal/mol
(2) Final Internal Energy of Ligand
(3) Torsional Free Energy
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Run = 20 
Cluster Rank
Number of conformations in this cluster
RMSD from reference structure       = 2.190 A
Estimated Free Energy of Binding    = -8.37 kcal/mol  [=(1)+(3)]
Estimated Inhibition Constant, Ki   = +7.36e-07       [Temperature = 298.15 K]
Final Docked Energy                 = -11.31 kcal/mol [=(1)+(2)]
(1) Final Intermolecular Energy     = -10.55 kcal/mol
(2) Final Internal Energy of Ligand = -0.77 kcal/mol
(3) Torsional Free Energy           = +2.18 kcal/mol
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Run = 7 

Cluster Rank = 1
Number of conformations in this cluster = 30
RMSD from reference structure       = 2.106 A
Estimated Free Energy of Binding    = -8.01 kcal/mol  [=(1)+(3)]
Estimated Inhibition Constant, Ki   = +1.34e-06       [Temperature = 298.15 K]
Final Docked Energy                 = -11.14 kcal/mol [=(1)+(2)]
(1) Final Intermolecular Energy     = -10.19 kcal/mol
(2) Final Internal Energy of Ligand = -0.95 kcal/mol
(3) Torsional Free Energy           = +2.18 kcal/mol

DPF = test.dpf
NEWDPF move udp_tr.pdbq
NEWDPF about 16.792999 18.735001 34.970001
NEWDPF tran0 16.771562 19.240141 34.663676
NEWDPF quat0 -0.276654 -0.688269 0.670632 9.784323
NEWDPF ndihe 7
NEWDPF dihe0 179.04 77.47 173.47 135.89 -39.09 46.20 144.65
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-0.211 2.106 
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Cluster Rank - 1 

Number of conformations in this cluster = 30 
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Run = 67 

Cluster Rank = 1 

Number of conformations in this cluster = 30 



RMSD from reference structure 

Estimated Free Energy of Binding 
Estimated Inhibition Constant, Ki 

Final Docked Energy 

(1) Final Intermolecular Energy 

(2) Final Internal Energy of Ligand 

(3) Torsional Free Energy 
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Run =68 

Cluster Rank = 1

Number of conformations in this cluster = 30 
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Run = 61 

Cluster Rank = 1 

Number of conformations in this cluster = 30 



RMSD from reference structure 

Estimated Free Energy of Binding 
Estimated Inhibition Constant, Ki 



Final Docked Energy - 

(1) Final Intermolecular Energy = 
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(3) Torsional Free Energy = 
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Number of conformations in this cluster = 30 



RMSD from reference structure 

Estimated Free Energy of Binding 
Estimated Inhibition Constant, Ki
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Final Docked Energy . = 

(1) Final Intermolecular Energy 

(2) Final Internal Energy of Ligand = 

(3) Torsional Free Energy = 
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-0.8 6 kcal/mol 
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Number of conformations in this cluster = 30 



RMSD from reference structure       = 2.223 A
Estimated Free Energy of Binding    = -7.66 kcal/mol  [=(1)+(3)]
Estimated Inhibition Constant, Ki   = +2.43e-06       [Temperature = 298.15 K]
Final Docked Energy                 = -10.34 kcal/mol [=(1)+(2)]
(1) Final Intermolecular Energy     = -9.84 kcal/mol
(2) Final Internal Energy of Ligand = -0.50 kcal/mol
(3) Torsional Free Energy           = +2.18 kcal/mol
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NEWDPF move udp_tr.pdbq 
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NEWDPF quat0-0. 617758 -0.594434 0.514804 13.202837 
NEWDPF ndihe 7
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Table 6 

Residue number will be set to the conformation's cluster rank. 
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32 

Run = 32
Cluster Rank = 1
Number of conformations in this cluster = 3
RMSD from reference structure       = 2.229 A
Estimated Free Energy of Binding    = -9.58 kcal/mol  [=(1)+(3)]
Estimated Inhibition Constant, Ki   = +9.46e-08       [Temperature = 298.15 K]
Final Docked Energy                 = -13.09 kcal/mol [=(1)+(2)]
(1) Final Intermolecular Energy     = -13.94 kcal/mol
(2) Final Internal Energy of Ligand = +0.85 kcal/mol
(3) Torsional Free Energy           = +4.36 kcal/mol

DPF = udp_gal.dpf
NEWDPF move udp_gal.pdbq
NEWDPF tran0 15.935308 17.497402 35.985764
NEWDPF quat0 -0.511638 0.842288 -0.169640 -0.016065
NEWDPF ndihe 14
NEWDPF dihe0 0.72 72.20 174.47 61.19 -168.15 179.54 -19.00 -11.55 -110.12 -5.97
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7\ rpOM 


264 8 


CA 


THR 


287 


1. 737 


9.563 


50. 617 
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TV TOM 
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WE CLAIM 

1. A model for a ligand binding domain of a galactosyltransferase.

2. A model as claimed in claim 1 wherein the ligand binding domain is a binding domain for a
diphosphate group of a sugar nucleotide donor, a nucleotide of a sugar nucleotide donor, a
nitrogenous heterocyclic base of a sugar nucleotide donor, a sugar of a nucleotide of a sugar
nucleotide donor, a selected sugar of a sugar nucleotide donor that is transferred to an acceptor, or an
acceptor.

3. A model of a ligand binding domain as claimed in any of the preceding claims wherein the model
comprises one or more of the amino acid residues shown in Table 1 or Figure 2, 3, or 4.

4. A model of a ligand binding domain as claimed in claim 1 comprising hydrogen bonding partners for
the amide hydrogen, carbonyl oxygen in position 4 and the carbonyl oxygen of uracil.

5. A model of a ligand binding domain as claimed in claim 1 that binds the uridine portion of UDP and
comprises Phe-134, Tyr-139, Ile-140, Val-136, Arg-194, Arg-202, Lys-209, Asp-173, His-218, and
Thr-137.

6. A model of a ligand binding domain as claimed in claim 1 that interacts with a pyrophosphate portion
of UDP comprising Asp-225, Val-226, and Asp-227 of a galactosyltransferase.

7. A model or secondary, tertiary and/or quaternary structure of a galactosyltransferase for an
α1,3-galactosyltransferase.

8. A model according to any preceding claims wherein the galactosyltransferase is characterized by the
atomic contacts of a galactosyltransferase as shown in Table 1.

9. A model as claimed in claim 8 wherein the atomic contacts are defined by the structural coordinates of
the atomic contacts as shown in Table 4 or Table 8.

10. A model according to any preceding claims in association with a ligand or substrate.

11. A model according to any preceding claims having the structural coordinates shown in Table 4 or Table
8.

12. A computer readable medium having stored thereon a model according to any preceding claim.

13. A computerized representation of a model according to any of the preceding claims.

14. A method of screening for a ligand capable of binding a ligand binding domain of a
galactosyltransferase comprising the use of a model according to any preceding claim.

15. A ligand identified by a method according to claim 14.

16. A ligand according to claim 15 that is capable of associating with one or more atomic contacts of a
galactosyltransferase as shown in Table 1.

17. A secondary and three dimensional structure or model of a ligand binding domain of a
galactosyltransferase that associates with a diphosphate of a sugar nucleotide donor comprising atomic
interactions 9, 10, and 11 of Table 1, each atomic interaction defined therein by an atomic contact on
the diphosphate, and an atomic contact on the galactosyltransferase.

18. A ligand binding domain of a galactosyltransferase that associates with uracil characterized by the 
following three hydrogen bonds: (1) the amide hydrogen of uracil in position 3 and OD1 of Asp-168 
of the galactosyltransferase, (2) the carbonyl oxygen of uracil in position 4 and the side chain of Lys- 
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204 of the galactosyltransferase, and (3) the carbonyl oxygen of uracil in position 2 and the amide 
hydrogen of the His-213 side chain of the galactosyltransferase.

19. A secondary or three dimensional structure or model of a ligand binding domain of a
galactosyltransferase that associates with a heterocyclic amine base of a sugar nucleotide donor
comprising atomic interactions 1, 2, 3, and 4 of Table 1, each atomic interaction defined therein by an
atomic contact on the heterocyclic amine base, and an atomic contact on the galactosyltransferase.

20. A secondary and three dimensional structure or model of a ligand binding domain of a
galactosyltransferase that associates with a ribose of a sugar nucleotide donor comprising atomic
interactions 5, 6, 7, and 8 of Table 1, each atomic interaction defined therein by an atomic contact on
the sugar, and an atomic contact on the galactosyltransferase.

21. A secondary or three dimensional structure of a ligand binding domain of a galactosyltransferase that
associates with UDP comprising atomic interactions 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or 11 of Table 1, each
atomic interaction defined therein by an atomic contact on the nucleotide, and an atomic contact on the
galactosyltransferase.

22. A secondary and three dimensional structure or model of a ligand binding domain of a
galactosyltransferase that associates with UDP-Gal comprising atomic interactions 1 through 11 of
Table 1, each atomic interaction defined therein by an atomic contact on the UDP of the UDP-Gal, and
an atomic contact on the galactosyltransferase.

23. A method of identifying a modulator of a galactosyltransferase or a ligand binding domain thereof
comprising the step of using the structural coordinates of a galactosyltransferase or a ligand binding
domain thereof as shown in Table 4 or 8, or a model according to any preceding claim to
computationally evaluate a test compound for its ability to associate with the galactosyltransferase or
binding domain or binding site thereof.

24. A method for identifying a potential modulator of a galactosyltransferase by determining binding
interactions between a test compound and atomic contacts of a ligand binding domain of a
galactosyltransferase comprising:

(a) generating the atomic contacts on a computer screen;

(b) generating test compounds with their spatial structure on the computer screen;

(c) determining whether the compounds associate or interact with the atomic contacts defining the
galactosyltransferase; and

(d) identifying test compounds that are potential modulators by their ability to enter into a
selected number of atomic contacts.

25. A method for identifying a potential modulator of a galactosyltransferase function by docking a
computer representation of a test compound with a computer representation of a structure of a
galactosyltransferase or a ligand binding domain thereof having the amino acid residues of a
galactosyltransferase or a ligand binding domain thereof as shown in Table 1 or Figures 3, 4, or 5.

26. A method for the design of ligands for galactosyltransferases based on the three dimensional structure
of a sugar nucleotide donor or part thereof comprising using the structural coordinates shown in Table
5, 6, or 7.



27. A method as claimed in claim 26 comprising (a) generating a computer representation of a sugar
nucleotide donor, or part thereof, defined by the structural coordinates shown in Table 5, 6, or 7; (b)
searching for molecules in a database that are similar to the defined sugar nucleotide donor, or part
thereof, using a searching computer program, or replacing portions of the compound with similar
chemical structures from a database using a compound building computer program.

28. A method as claimed in claim 27 comprising one or more of the following additional steps:

(a) testing whether a ligand is a modulator of the activity of a galactosyltransferase in cellular
assays and animal model assays;

(b) modifying the ligand;

(c) optionally rerunning steps (a) or (b); and

(d) preparing a pharmaceutical composition comprising the modulator.

29. A modulator identified by a method of claim 23, 24, 25, or 28.

30. Compounds of the formula I having the structural coordinates of uracil of Table 5, preferably Run 9,
Cluster 1 or ATOM 1 to ATOM 9, inclusive of Table 7:

wherein R1 and R2 are each independently hydrogen, alkyl, cycloalkyl, alkenyl, alkynyl, heterocyclic
rings, aryl, alkoxy, aryloxy, hydroxyl, thiol, thioaryl, amino, halogen, carboxylic acid or esters or
thioesters thereof, amines, sulfate, sulfonic or sulfinic acid or esters thereof, phosphate, pyrophosphate,
gallic acid, phosphonates, thioamide, and -OR12 where R12 is alkyl, cycloalkyl, alkenyl, alkynyl, or
heterocyclic ring;

31. Compounds of the following formula II having the structural coordinates of uridine of Table 5,
preferably Run 9, Cluster 1 or ATOM 1 to 20 inclusive, of Table 7:

wherein R1, R2, R3, R4, and R5 are each independently hydrogen, alkyl, cycloalkyl, alkenyl,
alkynyl, heterocyclic rings, aryl, alkoxy, aryloxy, hydroxyl, thiol, thioaryl, amino, halogen,
carboxylic acid or esters or thioesters thereof, amines, sulfate, sulfonic or sulfinic acid or
esters thereof, phosphate, pyrophosphate, gallic acid, phosphonates, thioamide, and -OR12
where R12 is alkyl, cycloalkyl, alkenyl, alkynyl, or heterocyclic ring,

and salts and optically active and racemic forms of a compound of the formula II.
32. Compounds of the formula III having the structural coordinates of UDP in Table 5, preferably Run 9, 
Cluster 1, or ATOM 1 to 28 inclusive of Table 7: 



wherein R1, R2, R3, R4, R& and R12 are each independently hydrogen, alkyl, cycloalkyl, alkenyl,
alkynyl, heterocyclic rings, aryl, alkoxy, aryloxy, hydroxyl, thiol, 







33. Compounds of the formula IV having the structural coordinates of UDP-Gal in Table 6, preferably
Run, Cluster 1:

wherein R1, R2, R3, R4, R7, R8, R9, and R10 are each independently hydrogen, alkyl, cycloalkyl,
alkenyl, alkynyl, heterocyclic rings, aryl, alkoxy, aryloxy, hydroxyl, thiol, thioaryl, amino,
halogen, carboxylic acid or esters or thioesters thereof (e.g. -CH2OH), amines, sulfate, sulfonic
or sulfinic acid or esters thereof, phosphate, gallic acid, phosphonates, thioamide, and -OR12
where R12 is alkyl, cycloalkyl, alkenyl, alkynyl, or heterocyclic ring, and X is a counter-ion
including sodium, lithium, potassium, calcium, magnesium, manganese, cobalt ions and the like,
as well as nontoxic ammonium, quaternary ammonium, and amine cations, preferably Mn2+,
and salts and optically active and racemic forms of a compound of the formula IV.

34. A pharmaceutical composition comprising a ligand, modulator, or compound according to any preceding 
claim, and a pharmaceutically acceptable carrier, diluent, excipient, or adjuvant or any combination 
thereof.

35. A method of treating and/or preventing disease comprising the step of administering a pharmaceutical 
composition according to claim 34 to a mammalian patient. 

36. A method of treating a disease associated with a galactosyltransferase with inappropriate activity in a 
cellular organism, comprising: 

(a) administering a pharmaceutical composition as claimed in claim 34; and 

(b) activating or inhibiting a galactosyltransferase to treat the disease. 

37. Use of a modulator or compound as claimed in any of the preceding claims in the preparation of a 
medicament to treat a disease associated with a galactosyltransferase with inappropriate activity in a 
cellular organism. 

38. Use of the structural coordinates of a galactosyltransferase structure as shown in Table 1 or 8, or the
structural coordinates of a ligand as shown in Table 5, 6, or 7 to manufacture a medicament. 

39. A computer for producing a model or three-dimensional representation of a molecule or molecular 
complex, wherein said molecule or molecular complex comprises a galactosyltransferase or ligand 
binding domain thereof defined by structural coordinates of galactosyltransferase amino acids or a ligand 
binding domain thereof, or comprises structural coordinates of atoms of a ligand or substrate, or a
three-dimensional representation of a homologue of said molecule or molecular complex, wherein said
computer comprises: 

(a) a machine-readable data storage medium comprising a data storage material encoded with
machine-readable data wherein said data comprises the structural coordinates of galactosyltransferase
amino acids according to Table 4 or 8 or a ligand binding domain thereof, or a ligand according to
Table 5, 6, or 7;

(b) a working memory for storing instructions for processing said machine-readable data; 

(c) a central-processing unit coupled to said working memory and to said machine-readable data 
storage medium for processing said machine-readable data into said three-dimensional
representation; and 

(d) a display coupled to said central-processing unit for displaying said three-dimensional 
representation. 
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of the claimed scope is impossible. Independent of the above reasoning, 
the claims also lack clarity (Article 6 PCT). 
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the fact that they have been identified by the method of claims 23-28, 
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international preliminary examination (Rule 66.1(e) PCT). The applicant 
is advised that the EPO policy when acting as an International
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preliminary examination on matter which has not been searched. This is . 
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